Documentation

Everything you need to get started with CYD — from the dashboard to the API.

Getting Started

›Quick Start Guide
›Creating an Account
›Uploading Your First Dataset

Dashboard

›Uploading Files
›Visualizing Data
›Cleaning Operations
›Exporting Results

REST API

›Authentication
›POST /api/v1/upload
›POST /api/v1/clean
›GET /api/v1/download/:id
›Rate Limits

API Quick Reference

# Upload a file
curl -X POST YOUR_API_BASE_URL/upload \
  -H "Authorization: Bearer <token>" \
  -F "file=@data.csv"

# Clean a dataset (the body is JSON, so set Content-Type explicitly)
curl -X POST YOUR_API_BASE_URL/clean \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"dataset_id": "abc123", "remove_duplicates": true, "fill_nulls": "mean"}'

# Download the cleaned file (the URL is quoted so the shell does not treat < and > as redirection)
curl "YOUR_API_BASE_URL/download/<dataset_id>" \
  -H "Authorization: Bearer <token>" -o cleaned.csv

Quick Start Guide

Sign up on the CYD homepage, then drag and drop any CSV file onto the dashboard. CYD auto-detects columns, types, and delimiters. Switch between the Preview, Charts, and Clean tabs to explore, visualize, and clean your data. Export the result as CSV, JSON, or Excel when you are done.

Creating an Account

Click "Get Started Free" on the homepage. Enter your name, email, and a password (minimum 8 characters). You will receive a confirmation email — click the link to activate your account. You can also sign up with Google or GitHub OAuth.

Uploading Your First Dataset

From the dashboard, click the "Upload" button or drag a file onto the drop zone. Supported formats: CSV, TSV, Excel (.xlsx), JSON, and Parquet. Maximum file size depends on your plan. CYD will parse the file and show a preview within seconds.

Uploading Files

The upload panel accepts files via drag-and-drop or the file picker. CYD auto-detects the delimiter (comma, tab, semicolon, or pipe) and the encoding (UTF-8, UTF-16, Latin-1, or Windows-1252). If auto-detection fails, specify the delimiter and encoding manually in the upload settings.
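
To illustrate what delimiter auto-detection involves, here is a minimal sketch using Python's standard csv.Sniffer, restricted to the four delimiters listed above. It is not CYD's detection logic, and the function name detect_delimiter is hypothetical.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ",\t;|") -> str:
    """Guess the field delimiter from a text sample using csv.Sniffer.

    csv.Sniffer raises csv.Error when it cannot decide, which is the
    point where a manual override would be needed.
    """
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


if __name__ == "__main__":
    sample = "id;name;score\n1;Ada;9.5\n2;Lin;8.0\n"
    print(repr(detect_delimiter(sample)))
```

A few header and data lines are usually enough for the sniffer; very short or irregular samples are where a manual setting helps.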

Visualizing Data

The Charts tab auto-generates visualizations for each column. Numeric columns get histograms and box plots. Categorical columns get bar charts. Date columns get time series plots. Click any chart to expand it and see detailed statistics. Use the chart type selector to switch between visualization types.

Cleaning Operations

The Clean tab provides one-click operations: Remove Duplicates (exact or fuzzy match), Handle Nulls (drop, fill with mean/median/mode/custom), Trim Whitespace, Standardize Dates, Convert Types, and Remove Outliers (IQR or Z-score method). Each operation shows a preview before it is applied. All operations are logged in the audit trail and can be undone.
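
For readers unfamiliar with two of these operations, the sketch below shows mean-filling nulls and IQR-based outlier removal on a single numeric column. It assumes the conventional 1.5 x IQR fences; the thresholds CYD actually applies are not documented here, and both function names are hypothetical.

```python
import statistics
from typing import List, Optional


def fill_nulls_with_mean(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the mean of the non-null entries."""
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    return [mean if v is None else v for v in values]


def remove_outliers_iqr(values: List[float], k: float = 1.5) -> List[float]:
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


if __name__ == "__main__":
    print(fill_nulls_with_mean([1.0, None, 3.0]))
    print(remove_outliers_iqr([1, 2, 3, 4, 5, 100]))
```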

Exporting Results

Click "Export" to download your cleaned dataset. Choose from CSV, JSON, or Excel (.xlsx) format. The export includes a cleaning audit log showing every operation applied. Pro and Team users can also share datasets via a unique link or download via the API.

API Authentication

All API requests require an API key, sent as a Bearer token. Generate a key from Dashboard > Settings > API Keys and include it in the Authorization header: "Authorization: Bearer YOUR_API_KEY". Keys can be revoked from the dashboard at any time.
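
A minimal Python sketch of attaching the header, using only the standard library. The host api.example.invalid is a placeholder rather than a real CYD endpoint, and the helper name authorized_request is hypothetical. The request is built but not sent.

```python
import urllib.request


def authorized_request(url: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request carrying the Bearer header described above."""
    if not api_key.strip():
        raise ValueError("API key must not be empty")
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Bearer {api_key.strip()}")
    return request


if __name__ == "__main__":
    # Placeholder base URL; substitute your own API base URL and key.
    req = authorized_request("https://api.example.invalid/api/v1/upload", "YOUR_API_KEY", "POST")
    print(req.get_method(), req.full_url, req.get_header("Authorization"))
```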

POST /api/v1/upload

Upload a file for processing. Send a multipart/form-data request with the file in the "file" field. Returns a JSON response with dataset_id, column_count, row_count, and detected_types. Maximum file size depends on your plan. Response time varies based on file size.
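
The multipart/form-data body can be sketched with the standard library as follows. This only shows how the "file" field is encoded; the helper name build_multipart is hypothetical, and an HTTP client library would normally do this encoding for you.

```python
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, content: bytes,
                    content_type: str = "text/csv") -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data.

    Returns (body, value for the Content-Type request header).
    The filename must not contain double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


if __name__ == "__main__":
    body, header = build_multipart("file", "data.csv", b"a,b\n1,2\n")
    print(header)
    print(len(body), "bytes")
```

Send the body in a POST request together with the returned Content-Type value and the Authorization header.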

POST /api/v1/clean

Apply cleaning operations to an uploaded dataset. Send a JSON body with dataset_id and cleaning options: remove_duplicates (boolean), fill_nulls ("mean", "median", "mode", "drop", or a custom value), trim_whitespace (boolean), standardize_dates (target format string), remove_outliers ("iqr" or "zscore"). Returns the updated row count and a summary of changes.
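
As an illustration, the JSON body can be assembled like this. The option names come from the list above; the helper name build_clean_payload is hypothetical, and leaving out unset options is an assumption about what the endpoint accepts.

```python
import json
from typing import Any, Optional

OUTLIER_METHODS = ("iqr", "zscore")


def build_clean_payload(dataset_id: str, *,
                        remove_duplicates: Optional[bool] = None,
                        fill_nulls: Any = None,
                        trim_whitespace: Optional[bool] = None,
                        standardize_dates: Optional[str] = None,
                        remove_outliers: Optional[str] = None) -> str:
    """Return the JSON request body, omitting options that were not set."""
    if remove_outliers is not None and remove_outliers not in OUTLIER_METHODS:
        raise ValueError(f"remove_outliers must be one of {OUTLIER_METHODS}")
    options = {
        "remove_duplicates": remove_duplicates,
        "fill_nulls": fill_nulls,  # "mean", "median", "mode", "drop", or a custom value
        "trim_whitespace": trim_whitespace,
        "standardize_dates": standardize_dates,
        "remove_outliers": remove_outliers,
    }
    payload = {"dataset_id": dataset_id}
    payload.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_clean_payload("abc123", remove_duplicates=True, fill_nulls="mean"))
```

Send the result with a Content-Type of application/json.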

GET /api/v1/download/:id

Download a cleaned dataset by its ID. Optional query parameter: format=csv|json|xlsx (default: csv). The response is the file content with appropriate Content-Type and Content-Disposition headers. Datasets are available for download for a limited time after creation.
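
A short sketch of building the download URL and reading the suggested filename from the Content-Disposition header. The base URL is a placeholder, both helper names are hypothetical, and the fallback filename is an arbitrary choice.

```python
from email.message import Message
from urllib.parse import quote, urlencode

ALLOWED_FORMATS = ("csv", "json", "xlsx")


def download_url(base_url: str, dataset_id: str, fmt: str = "csv") -> str:
    """Build the download URL with the optional format query parameter."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    path = base_url.rstrip("/") + "/download/" + quote(dataset_id, safe="")
    query = urlencode({"format": fmt})
    return f"{path}?{query}"


def filename_from_disposition(value: str, default: str = "cleaned.csv") -> str:
    """Extract the filename parameter from a Content-Disposition header value."""
    message = Message()
    message["Content-Disposition"] = value
    return message.get_filename() or default


if __name__ == "__main__":
    print(download_url("https://api.example.invalid/api/v1", "abc123", "json"))
    print(filename_from_disposition('attachment; filename="cleaned.json"'))
```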

Rate Limits

Rate limits vary by plan. Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every API response. Exceeding the limit returns HTTP 429. Check your dashboard for your current plan's rate limit details.
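
A client can use these headers to decide how long to wait before retrying. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds; this page does not specify the unit, so verify it against a real response. The function name seconds_until_reset is hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request (0.0 if quota remains).

    Assumption: X-RateLimit-Reset holds a Unix timestamp in seconds.
    Header names are looked up exactly as written, so pass a mapping
    that preserves the names shown above.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


if __name__ == "__main__":
    example = {"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}
    print(seconds_until_reset(example, now=990.0))
```

After an HTTP 429 response, sleeping for the returned number of seconds before retrying avoids repeated rejected requests.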

CSV & TSV

CYD supports comma-separated (CSV) and tab-separated (TSV) files. Auto-detection handles comma, tab, semicolon, and pipe delimiters. Quoted fields, escaped quotes, and multi-line values are fully supported. Encoding auto-detection covers UTF-8, UTF-16, Latin-1, and Windows-1252.
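
To show what quoted fields, escaped quotes, and multi-line values look like in practice, here is a small example parsed with Python's standard csv module. It illustrates the file conventions only, not CYD's parser; the function name parse_delimited is hypothetical.

```python
import csv
import io
from typing import List


def parse_delimited(text: str, delimiter: str = ",") -> List[List[str]]:
    """Parse delimited text, honoring quotes, doubled quotes, and embedded newlines."""
    # newline="" leaves line endings untouched so the csv module can handle them.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


if __name__ == "__main__":
    sample = 'id,note\n1,"He said ""hi"""\n2,"line one\nline two"\n'
    for row in parse_delimited(sample):
        print(row)
```

The second data row spans two physical lines in the file but is still a single record with two fields.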

Excel (.xlsx)

Upload Excel workbooks in .xlsx format. CYD reads the first sheet by default; you can select a different sheet in the upload settings. Formula cells are read as their current evaluated values. Merged cells are expanded. Date cells are parsed from Excel's internal date format.
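
Excel stores dates as serial day counts, with the time of day as the fractional part. The sketch below converts such a serial to a Python datetime, assuming the default 1900 date system and serials of 61 or greater (earlier values are affected by Excel's fictitious 29 February 1900). It illustrates the format only and is not CYD's parser; the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Day zero that makes serials from 61 upward line up with real calendar dates.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel 1900-system serial date to a datetime."""
    if serial < 61:
        raise ValueError("serials below 61 are not handled by this sketch")
    return EXCEL_EPOCH + timedelta(days=serial)


if __name__ == "__main__":
    print(excel_serial_to_datetime(45292))    # whole number: midnight
    print(excel_serial_to_datetime(45292.5))  # .5 is half a day, i.e. noon
```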

JSON

CYD supports JSON files containing an array of objects (each object becomes a row). Nested objects are flattened using dot notation (e.g., address.city). Arrays within objects are joined as comma-separated strings. The JSON must be valid; if it is malformed, CYD shows a parse error with the line number.
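
The flattening rules can be sketched in a few lines of Python. This assumes arrays are joined with a bare comma and no space, and it does not treat arrays of objects specially; how CYD handles those cases is not stated here. The function name flatten_record is hypothetical.

```python
from typing import Any, Dict


def flatten_record(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects to dot-notation keys and join arrays into strings."""
    row: Dict[str, Any] = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten_record(value, name))
        elif isinstance(value, list):
            # Assumption: plain comma separator, elements converted with str().
            row[name] = ",".join(str(item) for item in value)
        else:
            row[name] = value
    return row


if __name__ == "__main__":
    record = {"name": "Ada", "address": {"city": "London", "geo": {"lat": 51.5}}, "tags": ["a", "b"]}
    print(flatten_record(record))
```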

Parquet

Apache Parquet files are supported for both upload and export. CYD reads all row groups and columns. Nested schemas are flattened. Parquet export preserves column types for efficient downstream processing. Parquet's compressed, columnar layout makes it a good fit for large datasets.