Fwd: Feature Generation for Large datasets composed of many time series

2017-12-14 Thread julio . cesare
Hi dear Spark community! I want to create a library that generates features for potentially very large datasets, so I believe Spark could be a good tool for that. Let me explain what I need to do: each file 'F' of my dataset is composed of at least: - an id (string or int) - a timestamp (

Feature Generation for Large datasets composed of many time series

2017-07-19 Thread julio . cesare
Hello, I want to create a library that generates features for potentially very large datasets. Each file 'F' of my dataset is composed of at least: - an id (string or int) - a timestamp (or a long value) - a value (int or string) I want my tool to: - compute aggregate functions for many