@chemist69: Thanks! And by the way, I forgot to mention the most important feature from a practical perspective: there is already a simple browser viewer available via `df.openInBrowser()`. No plotting yet, but it is at least handy for quick data inspection.
@jlp765: Handling different types is already possible, as illustrated in the readme example. The main difference compared to dynamically typed APIs like Pandas is that you have to tell the compiler about your schema once -- but then you can fully benefit from type safety in the processing pipeline. Let's say your CSV has columns _name_ (string), _age_ (int), _height_ (float), and _birthday_ (date); then your code would look like this:

```nim
const schema = [
  col(StrCol, "name"),
  col(IntCol, "age"),
  col(FloatCol, "height"),
  col(DateCol, "birthday")  # DateCol not yet implemented, but coming soon
]
let df = DF.fromText("data.csv.gz").map(schemaParser(schema, ','))
```

What happens here is that the `schemaParser` macro builds a parser proc which takes a string as input and returns a named tuple of type `tuple[name: string, age: int64, height: float, birthday: SomeDateTimeTypeTBD]` (note that this allows generating highly customized machine code, which is why the parser can be much faster than generic parsers). So yes, the data frame only holds a single type, but that type is heterogeneous, and you can extract the individual "columns" by e.g. `df.map(x => x.name)`, giving you a `DataFrame[string]` instead of the full tuple.

Having to specify the schema might look tedious from a Pandas perspective. But the big benefit is that you can never get the column names or types wrong. In Pandas you see a lot of code which just says `def preprocess_data(df)`, and it is neither clear what `df` really contains nor what assumptions `preprocess_data` makes about the data. This can be solved by extensive documentation & testing, but is still difficult to maintain in big projects. With a type-safe schema the assumptions about the data become explicit in the code, and the compiler can ensure that they are satisfied.

Global aggregation is already available. You could do for instance `df.map(x => x.age).mean()` to get the average age.
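To make the type-safety benefit concrete, here is a rough sketch of how a wrong column name gets caught at compile time (the module names in the imports and the `agee` typo are just illustrative; the schema and parser calls follow the example above):

```nim
import nimdata      # assumed module name for the library
import sugar        # provides the `=>` anonymous proc syntax

let df = DF.fromText("data.csv.gz").map(schemaParser(schema, ','))

let ages = df.map(x => x.age)   # DataFrame[int64], statically known

# The following would be a compile error, not a runtime KeyError as in
# Pandas, because the tuple type has no field `agee`:
# let oops = df.map(x => x.agee)

echo ages.mean()
```

In other words, the schema declaration moves the "what does this data frame contain?" question from documentation and tests into the type system.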
There is also `reduce`/`fold`, which allows implementing custom aggregation functions. What's still missing are `groupBy` and `join`, but they are high priority for me as well, so I hope I can add them soon.
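As a rough sketch of a custom aggregation via `fold` (the exact signature assumed here -- an initial accumulator value plus a combining proc -- is an assumption on my part, so check the API docs):

```nim
import nimdata      # assumed module name for the library
import sugar        # provides the `=>` anonymous proc syntax

let df = DF.fromText("data.csv.gz").map(schemaParser(schema, ','))

# Hypothetical custom aggregation: maximum age, folding over the column
# with int64.low as the neutral starting value.
let maxAge = df.map(x => x.age).fold(int64.low, (acc, x) => max(acc, x))
echo maxAge
```

The same pattern should cover sums, counts of matching rows, or any other accumulator-style aggregation until dedicated helpers exist.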