# Micro Libraries Extraction Proposal

## Overview

This document proposes extracting reusable components from the `blackswan-web` VS Code extension into independent micro libraries. Each proposed library is designed to be:

- **Independent**: No VS Code API dependencies
- **Minimal dependencies**: Relies only on Node.js built-ins or small, focused packages
- **Testable in isolation**: Can be unit tested without VS Code infrastructure
- **Reusable**: Suitable for CLI tools, web applications, or other extensions

---

## Proposed Micro Libraries

### 1. `@blackswan/http-utils`

**Purpose**: Core HTTP utility functions for parsing, formatting, and processing HTTP data.

**Source files**:
- `src/shared/utils/httpUtils.ts`

**Extractable functions**:
| Function | Description |
|----------|-------------|
| `generateId()` | Generate unique UUIDs |
| `formatBytes(bytes)` | Format bytes to human-readable string (e.g., "1.5 KB") |
| `formatDuration(ms)` | Format milliseconds to human-readable duration |
| `truncate(str, maxLength)` | Truncate strings with ellipsis |
| `beautifyContent(content, contentType)` | Auto-format JSON/XML content |
| `decodeBase64Body(content, isBase64)` | Decode base64 body with binary detection |
| `createHexDump(buffer, bytesPerLine)` | Create hex dump of binary data |
| `isGzipEncoded(headers)` | Detect gzip encoding from headers |
| `tryDecompressGzip(content, headers)` | Decompress gzip content |
| `processResponseBody(content, headers, mimeType)` | Full response body processing pipeline |

**Dependencies**: Node.js `crypto`, `zlib` (built-ins)

**Example usage**:
```typescript
import { formatBytes, processResponseBody } from '@blackswan/http-utils';

console.log(formatBytes(1536)); // "1.5 KB"
const processed = processResponseBody(gzipContent, headers, 'application/json');
```
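To give a feel for how small these helpers can be, here is a minimal sketch of `formatBytes` and `createHexDump`. It is not the code in `httpUtils.ts`: the unit labels, the one-decimal rounding, the `Uint8Array` parameter type, and the dump layout (offset, hex bytes, printable ASCII) are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real httpUtils.ts may round or label units differently.
const BYTE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB'];

function formatBytes(bytes: number): string {
  if (!Number.isFinite(bytes) || bytes <= 0) return '0 B';
  let value = bytes;
  let unit = 0;
  // Step up one unit for every factor of 1024.
  while (value >= 1024 && unit < BYTE_UNITS.length - 1) {
    value /= 1024;
    unit++;
  }
  // Keep at most one decimal place.
  return `${Math.round(value * 10) / 10} ${BYTE_UNITS[unit]}`;
}

// Assumed layout per line: 8-digit hex offset, hex bytes, printable ASCII.
function createHexDump(buffer: Uint8Array, bytesPerLine = 16): string {
  const lines: string[] = [];
  for (let offset = 0; offset < buffer.length; offset += bytesPerLine) {
    const chunk = Array.from(buffer.subarray(offset, offset + bytesPerLine));
    const hex = chunk.map((b) => b.toString(16).padStart(2, '0')).join(' ');
    // Non-printable bytes are shown as "." in the ASCII column.
    const ascii = chunk
      .map((b) => (b >= 0x20 && b <= 0x7e ? String.fromCharCode(b) : '.'))
      .join('');
    const address = offset.toString(16).padStart(8, '0');
    lines.push(`${address}  ${hex.padEnd(bytesPerLine * 3 - 1)}  ${ascii}`);
  }
  return lines.join('\n');
}
```

Both functions use only language built-ins, so this part of the package would also run in a browser; only the gzip and UUID helpers would need Node.js modules.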

---

### 2. `@blackswan/cookie-parser`

**Purpose**: Parse, detect formats, decode, and analyze HTTP cookies.

**Source files**:
- `src/features/cookies/parser/CookieParserServiceImpl.ts`
- `src/features/cookies/parser/CookieParserService.ts`

**Extractable functionality**:
| Feature | Description |
|---------|-------------|
| `parseCookieHeader(header)` | Parse `Cookie:` header into name/value pairs |
| `parseSetCookieHeader(header)` | Parse `Set-Cookie:` header with attributes |
| `detectCookieFormat(value)` | Detect format: `iron-sealed`, `jwt`, `base64`, `json`, `url-encoded`, `plain` |
| `decodeCookieValue(value, format)` | Decode based on detected format |
| `decodeIronSealed(value)` | Parse hapi.js iron-sealed cookies |
| `decodeJWT(value)` | Decode JWT token in cookie |
| `analyzeCookieSecurity(name, value, format, attributes?)` | Security issue detection |

**Supported formats**:
- Iron-sealed cookies (hapi.js `Fe26.2*` format)
- JWT tokens
- Base64 encoded values
- URL encoded values
- JSON values
- Plain text

**Dependencies**: None (pure JavaScript/TypeScript)

**Example usage**:
```typescript
import { detectCookieFormat, decodeCookieValue, analyzeCookieSecurity } from '@blackswan/cookie-parser';

const value = 'eyJhbGciOiJIUzI1NiIs...'; // cookie value (truncated here)
const format = detectCookieFormat(value);
// => 'jwt'

const decoded = decodeCookieValue(value, format);
// => { header: {...}, payload: {...} }

const issues = analyzeCookieSecurity('session', value, format);
// => ['Session identifier', 'JWT has expired']
```
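The format detection described above can be approximated with a short chain of heuristics. The sketch below is an assumption about how `parseCookieHeader` and `detectCookieFormat` might work, not a copy of `CookieParserServiceImpl.ts`. The check order, the regular expressions, the 8-character minimum for base64, and the helper `looksLikeJson` are all hypothetical.

```typescript
type CookieFormat = 'iron-sealed' | 'jwt' | 'base64' | 'json' | 'url-encoded' | 'plain';

// Split a "Cookie:" header value into name/value pairs.
function parseCookieHeader(header: string): Array<{ name: string; value: string }> {
  return header
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.length > 0)
    .map((pair) => {
      const eq = pair.indexOf('=');
      // Assumption: a pair without "=" is a name with an empty value.
      return eq === -1
        ? { name: pair, value: '' }
        : { name: pair.slice(0, eq).trim(), value: pair.slice(eq + 1).trim() };
    });
}

// Hypothetical helper: true only for strings that parse as a JSON object or array.
function looksLikeJson(value: string): boolean {
  const trimmed = value.trim();
  if (!/^[\[{]/.test(trimmed)) return false;
  try {
    JSON.parse(trimmed);
    return true;
  } catch {
    return false;
  }
}

// Heuristic detection; most specific formats are checked first.
function detectCookieFormat(value: string): CookieFormat {
  if (value.startsWith('Fe26.2*')) return 'iron-sealed';
  // JWT: three base64url segments, the first starting with "eyJ" (encoded '{"').
  if (/^eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$/.test(value)) return 'jwt';
  if (looksLikeJson(value)) return 'json';
  if (/%[0-9A-Fa-f]{2}/.test(value)) return 'url-encoded';
  // Base64 is the weakest signal, so it is checked last.
  if (value.length >= 8 && value.length % 4 === 0 && /^[A-Za-z0-9+\/]+={0,2}$/.test(value)) {
    return 'base64';
  }
  return 'plain';
}
```

The base64 check is deliberately last because plain alphanumeric values of the right length also satisfy it. A production version would likely need an additional signal, such as checking that the decoded bytes are printable.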

---

### 3. `@blackswan/jwt-toolkit`

**Purpose**: JWT parsing, modification, and security-testing toolkit.

**Source files**:
- `src/features/jwt/parser/JwtParserServiceImpl.ts`
- `src/features/jwt/modifier/JwtModifiers.ts`
- `src/features/jwt/modifier/JwtModifierServiceImpl.ts`

**Extractable functionality**:

#### Parsing
| Function | Description |
|----------|-------------|
| `parseJwt(token)` | Parse JWT into header, payload, signature |
| `encodeJwt(header, payload, signature)` | Encode JWT from parts |
| `base64UrlEncode(str)` | Base64 URL-safe encoding |
| `base64UrlDecode(str)` | Base64 URL-safe decoding |
| `findJwtsInText(text)` | Find all JWTs in arbitrary text |

#### Security Analysis
| Function | Description |
|----------|-------------|
| `analyzeJwtSecurity(jwt)` | Detect security issues (`alg: none`, expired token, missing `exp`) |
| `isExpired(jwt)` | Check if JWT is expired |
| `getExpirationDate(jwt)` | Get expiration as Date object |

#### JWT Modifiers (Exploit Testing)
| Modifier | Description | Risk Level |
|----------|-------------|------------|
| `AlgorithmNoneModifier` | Set algorithm to "none" to bypass signature | Critical |
| `AlgorithmConfusionModifier` | RS256 → HS256 confusion attack | Critical |
| `ExpiryExtensionModifier` | Extend token expiration | High |
| `RemoveExpiryModifier` | Remove expiration claim | High |
| `RoleElevationModifier` | Elevate user role/permissions | Critical |
| `SubjectManipulationModifier` | Change subject to impersonate users | Critical |
| `KeyIdInjectionModifier` | SQL injection via the `kid` header parameter | Critical |
| `JWKHeaderInjectionModifier` | Self-signed token attack | Critical |
| `AddClaimModifier` | Add custom claims | Utility |

**Dependencies**: None (pure JavaScript/TypeScript)

**Example usage**:
```typescript
import { parseJwt, applyModifier, analyzeJwtSecurity } from '@blackswan/jwt-toolkit';

const jwt = parseJwt('eyJhbGciOiJSUzI1NiIs...');
const issues = analyzeJwtSecurity(jwt);
// => ['no expiration']

// Apply algorithm confusion attack
const modified = applyModifier(jwt, 'alg-confusion', { targetAlgorithm: 'HS256' });
// => { header: { alg: 'HS256', ... }, payload: {...}, description: '...' }
```
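The parsing half of this package is mostly base64url handling plus `JSON.parse`. The following sketch shows one way the listed functions could fit together. It assumes the global `atob`/`btoa` functions (available in browsers and recent Node.js versions) and ASCII-only JSON. The `ParsedJwt` shape, the `nowMs` parameter, and the function `applyAlgorithmNone` are hypothetical stand-ins, not the actual `JwtParserServiceImpl` or `AlgorithmNoneModifier` API.

```typescript
// Hypothetical shape; the real parser may expose more fields.
interface ParsedJwt {
  header: Record<string, unknown>;
  payload: Record<string, unknown>;
  signature: string; // kept base64url-encoded
}

function base64UrlDecode(input: string): string {
  const base64 = input.replace(/-/g, '+').replace(/_/g, '/');
  // Restore the "=" padding that base64url omits.
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  return atob(padded);
}

function base64UrlEncode(input: string): string {
  return btoa(input).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

function parseJwt(token: string): ParsedJwt {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Expected three dot-separated segments');
  return {
    header: JSON.parse(base64UrlDecode(parts[0])),
    payload: JSON.parse(base64UrlDecode(parts[1])),
    signature: parts[2],
  };
}

function encodeJwt(
  header: Record<string, unknown>,
  payload: Record<string, unknown>,
  signature: string,
): string {
  const encode = (part: Record<string, unknown>) => base64UrlEncode(JSON.stringify(part));
  return [encode(header), encode(payload), signature].join('.');
}

// "exp" is expressed in seconds since the epoch (RFC 7519).
function isExpired(jwt: ParsedJwt, nowMs: number = Date.now()): boolean {
  const exp = jwt.payload['exp'];
  return typeof exp === 'number' && exp * 1000 <= nowMs;
}

// Stand-in for AlgorithmNoneModifier: set alg to "none" and drop the signature.
function applyAlgorithmNone(jwt: ParsedJwt): ParsedJwt {
  return { header: { ...jwt.header, alg: 'none' }, payload: { ...jwt.payload }, signature: '' };
}
```

Returning a new object from each modifier, rather than mutating the input, would let callers chain several modifiers and still compare the result against the original token.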

---

### 4. `@blackswan/http-request-parser`

**Purpose**: Parse raw HTTP request text into structured data.

**Source files**:
- `src/features/xss/active/XssRequestParser.ts`
- `src/features/xss/analyzer/XssTextIndexer.ts`

**Extractable functionality**:
| Feature | Description |
|---------|-------------|
| `parseHttpRequest(text)` | Parse raw HTTP request into structured object |
| `extractQueryString(target)` | Extract query string from request target |
| `detectBodyFormat(contentType, body)` | Detect body format: `json`, `form`, `xml`, `text` |
| `TextIndexer` | Map text offsets to line/character positions |

**Parsed request structure**:
```typescript
interface ParsedHttpRequest {
  method: string;
  target: string;
  httpVersion?: string;
  headers: Array<{ name: string; value: string; valueRange: TextRange }>;
  body: string | null;
  bodyFormat: 'json' | 'form' | 'xml' | 'text' | 'unknown';
  bodyRange?: TextRange;
  queryString: string | null;
}
```

**Dependencies**: None

**Example usage**:
```typescript
import { parseHttpRequest } from '@blackswan/http-request-parser';

const request = parseHttpRequest(`GET /api/users?id=123 HTTP/1.1
Host: example.com
Cookie: session=abc123

`);

console.log(request.method); // 'GET'
console.log(request.queryString); // 'id=123'
console.log(request.headers); // [{ name: 'Host', value: 'example.com', ... }, ...]
```
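As a rough illustration of the parsing steps (split head from body at the first blank line, read the request line, then read headers), here is a simplified sketch. It omits the `valueRange`, `bodyRange`, and `bodyFormat` fields of `ParsedHttpRequest`, and the type name `SimpleParsedRequest` is a hypothetical placeholder. The real `XssRequestParser.ts` may handle malformed input differently.

```typescript
// Reduced version of ParsedHttpRequest, without text ranges (assumption).
interface SimpleParsedRequest {
  method: string;
  target: string;
  httpVersion?: string;
  headers: Array<{ name: string; value: string }>;
  body: string | null;
  queryString: string | null;
}

function extractQueryString(target: string): string | null {
  const question = target.indexOf('?');
  if (question === -1) return null;
  // Stop at a fragment marker if one is present.
  const hash = target.indexOf('#', question);
  return target.slice(question + 1, hash === -1 ? undefined : hash);
}

function parseHttpRequest(text: string): SimpleParsedRequest {
  // The head ends at the first blank line; accept CRLF or LF line endings.
  const separator = /\r?\n\r?\n/.exec(text);
  const head = separator ? text.slice(0, separator.index) : text;
  const rest = separator ? text.slice(separator.index + separator[0].length) : '';

  const [requestLine = '', ...headerLines] = head.split(/\r?\n/);
  const [method = '', target = '', httpVersion] = requestLine.trim().split(/\s+/);

  const headers = headerLines
    .map((line) => {
      const colon = line.indexOf(':');
      // Lines without a colon are skipped rather than treated as errors.
      return colon === -1
        ? null
        : { name: line.slice(0, colon).trim(), value: line.slice(colon + 1).trim() };
    })
    .filter((h): h is { name: string; value: string } => h !== null);

  return {
    method,
    target,
    httpVersion,
    headers,
    body: rest.length > 0 ? rest : null,
    queryString: extractQueryString(target),
  };
}
```

Keeping headers as an ordered array instead of a map preserves duplicates and original casing, which matters when the parsed result is used to point back into the request text.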

---

### 5. `@blackswan/xss-injection-extractor`

**Purpose**: Extract potential XSS injection points from HTTP requests.

**Source files**:
- `src/features/xss/analyzer/XssInjectionPointExtractor.ts`
- `src/features/xss/types/`

**Extractable functionality**:
| Feature | Description |
|---------|-------------|
| `extractInjectionPoints(requestText)` | Extract all injection points |
| `extractQueryPoints(...)` | Extract URL query parameter injection points |
| `extractHeaderPoints(...)` | Extract header injection points |
| `extractCookiePoints(...)` | Extract cookie injection points |
| `extractBodyPoints(...)` | Extract body injection points (form/JSON) |

**Injection point structure**:
```typescript
interface XssInjectionPoint {
  location: 'query' | 'header' | 'cookie' | 'body';
  name: string;
  value: string;
  range: TextRange;
}
```

**Dependencies**: `@blackswan/http-request-parser`

**Example usage**:
```typescript
import { extractInjectionPoints } from '@blackswan/xss-injection-extractor';

const points = extractInjectionPoints(`POST /search HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Cookie: user=admin

query=<script>&page=1`);

// => [
//   { location: 'cookie', name: 'user', value: 'admin', ... },
//   { location: 'body', name: 'query', value: '<script>', ... },
//   { location: 'body', name: 'page', value: '1', ... }
// ]
```
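Query strings, form bodies, and cookie headers are all lists of `name=value` pairs, so one helper can cover several of the extractors listed above. The sketch below is an assumption about how that could look. `OffsetRange` is a hypothetical stand-in for `TextRange`, `extractPairs` is an invented name, and values are kept raw (not URL-decoded) so that each range maps exactly onto the request text.

```typescript
// Hypothetical stand-in for TextRange: absolute character offsets in the request text.
interface OffsetRange {
  start: number;
  end: number;
}

interface InjectionPoint {
  location: 'query' | 'header' | 'cookie' | 'body';
  name: string;
  value: string;
  range: OffsetRange;
}

// Extract name=value pairs from `source`, which begins at `baseOffset` in the full request.
function extractPairs(
  source: string,
  baseOffset: number,
  separator: string,
  location: InjectionPoint['location'],
): InjectionPoint[] {
  const points: InjectionPoint[] = [];
  let cursor = 0; // offset of the current pair within `source`
  for (const pair of source.split(separator)) {
    const eq = pair.indexOf('=');
    if (eq > 0) {
      const value = pair.slice(eq + 1);
      const valueStart = baseOffset + cursor + eq + 1;
      points.push({
        location,
        // Trim the space that follows "; " in Cookie headers.
        name: pair.slice(0, eq).trim(),
        value,
        range: { start: valueStart, end: valueStart + value.length },
      });
    }
    cursor += pair.length + separator.length;
  }
  return points;
}
```

With this helper, a form body would be split on `&` and a `Cookie` header value on `;`, each with the offset at which that substring starts in the request.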

---

### 6. `@blackswan/url-route-utils`

**Purpose**: URL parsing and filesystem-safe path conversion utilities.

**Source files**:
- `src/proxy/storage/routeUtils.ts`

**Extractable functionality**:
| Function | Description |
|----------|-------------|
| `urlToRoutePath(url)` | Convert URL to filesystem-safe route path |
| `routePathToUrlPath(routePath)` | Convert route path back to URL path |
| `extractHost(url)` | Extract hostname from URL |
| `encodePathSegment(segment)` | Encode path segment for filesystem safety |
| `decodePathSegment(segment)` | Decode filesystem-safe segment |
| `buildContainerPath(url, method, timestamp)` | Build storage container path |
| `parseContainerPath(path)` | Parse container path into components |
| `matchesRouteFilter(path, filter)` | Match path against wildcard filter |
| `normalizeUrl(url)` | Normalize URL for consistent comparison |

**Dependencies**: None (uses built-in `URL`)

**Example usage**:
```typescript
import { urlToRoutePath, buildContainerPath, matchesRouteFilter } from '@blackswan/url-route-utils';

const routePath = urlToRoutePath('https://api.example.com/api/v1/users/123');
// => 'api/v1/users/123'

const containerPath = buildContainerPath('https://api.example.com/users', 'POST', 1704384000000);
// => 'api.example.com/users/POST/1704384000000'

matchesRouteFilter('/api/v1/users/123', '/api/*/users/*');
// => true
```
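Wildcard matching of this kind is commonly done by compiling the filter to a regular expression. The sketch below assumes that `*` matches exactly one non-empty path segment, which is consistent with the example above but is not confirmed by `routeUtils.ts`. The real function may let `*` span several segments.

```typescript
// Assumption: "*" matches one non-empty path segment (no "/").
function matchesRouteFilter(path: string, filter: string): boolean {
  const pattern = filter
    .split('*')
    // Escape regex metacharacters in the literal parts of the filter.
    .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^/]+');
  return new RegExp(`^${pattern}$`).test(path);
}
```

Escaping the literal parts before joining keeps characters such as `.` in a filter from acting as regex wildcards.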

---

### 7. `@blackswan/text-utils`

**Purpose**: Text processing utilities for analysis and truncation.

**Source files**:
- `src/proxy/analysis/utils/textTruncator.ts`
- `src/features/xss/analyzer/XssTextIndexer.ts`

**Extractable functionality**:
| Feature | Description |
|---------|-------------|
| `TextTruncator` | Truncate large text blobs with configurable limit |
| `TextIndexer` | Map character offsets to line/column positions |
| `positionAt(offset)` | Get line/character from offset |
| `rangeFromOffsets(start, end)` | Get text range from offsets |

**Dependencies**: None

**Example usage**:
```typescript
import { TextTruncator, TextIndexer } from '@blackswan/text-utils';

const truncator = new TextTruncator(1000);
const truncated = truncator.truncate(veryLongText);
// => '...first 1000 chars... [truncated]'

const indexer = new TextIndexer(multilineText);
const pos = indexer.positionAt(150);
// => { line: 3, character: 25, offset: 150 }
```
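Offset-to-position mapping is usually implemented by recording where each line starts and then binary-searching that list. The sketch below follows that approach. It assumes zero-based line and character numbers and treats only `\n` as a line break, and `rangeFromOffsets` returns a pair of positions. None of these details are confirmed by `XssTextIndexer.ts`.

```typescript
interface Position {
  line: number; // zero-based (assumption)
  character: number; // zero-based (assumption)
  offset: number;
}

class TextIndexer {
  private readonly text: string;
  // Offset of the first character of every line.
  private readonly lineStarts: number[] = [0];

  constructor(text: string) {
    this.text = text;
    for (let i = 0; i < text.length; i++) {
      if (text[i] === '\n') this.lineStarts.push(i + 1);
    }
  }

  positionAt(offset: number): Position {
    // Clamp out-of-range offsets to the text bounds.
    const clamped = Math.max(0, Math.min(offset, this.text.length));
    // Binary search for the last line start that is <= clamped.
    let low = 0;
    let high = this.lineStarts.length - 1;
    while (low < high) {
      const mid = Math.ceil((low + high) / 2);
      if (this.lineStarts[mid] <= clamped) low = mid;
      else high = mid - 1;
    }
    return { line: low, character: clamped - this.lineStarts[low], offset: clamped };
  }

  rangeFromOffsets(start: number, end: number): { start: Position; end: Position } {
    return { start: this.positionAt(start), end: this.positionAt(end) };
  }
}
```

Building the line-start table once makes each lookup logarithmic in the number of lines, which matters when many injection points are mapped against the same request.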

---

### 8. `@blackswan/discovery-graph`

**Purpose**: Build graph structures from subdomain/technology discovery results.

**Source files**:
- `src/discovery/DiscoveryGraphBuilder.ts`
- `src/discovery/types.ts`

**Extractable functionality**:
| Feature | Description |
|---------|-------------|
| `DiscoveryGraphBuilder` | Build react-flow compatible graph from discovery results |
| `addSubdomain(subdomain)` | Add subdomain node to graph |
| `addProbedResult(result)` | Add probed subdomain with technologies |
| `build()` | Build final graph structure |

**Graph structure**:
```typescript
interface DiscoveryGraph {
  nodes: DiscoveryNode[];  // domain, subdomain, technology, resource nodes
  edges: DiscoveryEdge[];  // relationships between nodes
}
```

**Dependencies**: None

**Example usage**:
```typescript
import { DiscoveryGraphBuilder } from '@blackswan/discovery-graph';

const builder = new DiscoveryGraphBuilder();
builder.initialize('example.com');
builder.addSubdomain({ host: 'api.example.com', ip: '1.2.3.4', isActive: true, source: 'dns' });
builder.addProbedResult({
  subdomain: { host: 'api.example.com', /* ... */ },
  technologies: [{ name: 'nginx', version: '1.21', category: 'web-server' }],
  resources: [/* ... */]
});

const graph = builder.build();
// => { nodes: [...], edges: [...] }
```
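A builder like this mainly has to create nodes once and connect them with edges. The sketch below shows that idea with deliberately simplified, hypothetical types. `GraphBuilderSketch`, `GraphNode`, `GraphEdge`, `addTechnology`, and the `type:label` id scheme are not taken from `DiscoveryGraphBuilder.ts`. The `id`/`position`/`data` node fields and `id`/`source`/`target` edge fields follow the general shape React Flow expects.

```typescript
// Hypothetical, simplified node and edge shapes.
interface GraphNode {
  id: string;
  type: 'domain' | 'subdomain' | 'technology';
  data: { label: string };
  position: { x: number; y: number };
}

interface GraphEdge {
  id: string;
  source: string;
  target: string;
}

class GraphBuilderSketch {
  // Maps keyed by id, so repeated additions do not create duplicates.
  private readonly nodes = new Map<string, GraphNode>();
  private readonly edges = new Map<string, GraphEdge>();
  private rootId: string | null = null;

  initialize(domain: string): void {
    this.rootId = this.addNode('domain', domain);
  }

  addSubdomain(host: string): string {
    if (this.rootId === null) throw new Error('Call initialize() first');
    const id = this.addNode('subdomain', host);
    this.addEdge(this.rootId, id);
    return id;
  }

  // Stand-in for addProbedResult: link one technology to one subdomain.
  addTechnology(host: string, name: string): void {
    const subdomainId = this.addSubdomain(host);
    const technologyId = this.addNode('technology', name);
    this.addEdge(subdomainId, technologyId);
  }

  build(): { nodes: GraphNode[]; edges: GraphEdge[] } {
    return { nodes: Array.from(this.nodes.values()), edges: Array.from(this.edges.values()) };
  }

  private addNode(type: GraphNode['type'], label: string): string {
    const id = `${type}:${label}`;
    if (!this.nodes.has(id)) {
      // Layout is left to the consumer; every node starts at the origin.
      this.nodes.set(id, { id, type, data: { label }, position: { x: 0, y: 0 } });
    }
    return id;
  }

  private addEdge(source: string, target: string): void {
    const id = `${source}->${target}`;
    this.edges.set(id, { id, source, target });
  }
}
```

Because ids are derived from type and label, a technology shared by two subdomains becomes a single node with two incoming edges.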

---

## Implementation Recommendations

### Directory Structure
```
packages/
├── http-utils/
│   ├── src/
│   │   └── index.ts
│   ├── tests/
│   ├── package.json
│   └── tsconfig.json
├── cookie-parser/
│   └── ...
├── jwt-toolkit/
│   └── ...
└── ...
```

### Package Configuration
Each package should:
1. Target ES2020+ with CommonJS and ESM outputs
2. Include TypeScript declarations
3. Have zero or minimal runtime dependencies
4. Include comprehensive unit tests
5. Follow semantic versioning

### Migration Path
1. Extract core functionality (no VS Code imports)
2. Add comprehensive tests
3. Publish as internal/private packages first
4. Update `blackswan-web` to consume the packages
5. Iterate based on usage patterns
6. Consider public release if useful externally

---

## Priority Order

| Priority | Library | Rationale |
|----------|---------|-----------|
| 1 | `@blackswan/jwt-toolkit` | High standalone value, security testing utility |
| 2 | `@blackswan/cookie-parser` | High standalone value, useful for any HTTP tool |
| 3 | `@blackswan/http-utils` | Foundational utilities, widely reusable |
| 4 | `@blackswan/http-request-parser` | Foundational for XSS tools |
| 5 | `@blackswan/xss-injection-extractor` | Depends on request parser |
| 6 | `@blackswan/url-route-utils` | Useful for routing/storage |
| 7 | `@blackswan/text-utils` | Simple utilities |
| 8 | `@blackswan/discovery-graph` | More specialized use case |

---

## Summary

| Library | Lines of Code (est.) | Dependencies | Complexity |
|---------|---------------------|--------------|------------|
| `@blackswan/http-utils` | ~200 | Node built-ins | Low |
| `@blackswan/cookie-parser` | ~400 | None | Medium |
| `@blackswan/jwt-toolkit` | ~600 | None | Medium |
| `@blackswan/http-request-parser` | ~150 | None | Low |
| `@blackswan/xss-injection-extractor` | ~200 | http-request-parser | Low |
| `@blackswan/url-route-utils` | ~200 | None | Low |
| `@blackswan/text-utils` | ~100 | None | Low |
| `@blackswan/discovery-graph` | ~200 | None | Low |

**Total estimated extractable code**: ~2,050 lines across 8 micro libraries.

All proposed libraries are designed to be VS Code-agnostic and can be used in:
- CLI tools
- Web applications
- Other VS Code extensions
- Node.js services
- Browser environments (most packages; `@blackswan/http-utils` relies on the Node.js `crypto` and `zlib` built-ins)
