On Mon, Oct 29, 2018 at 6:38 PM Keith Medcalf <kmedc...@dessus.com> wrote:
>
> See the ext/misc/unionvtab.c extension for "reading" a bunch of databases
> as if they were a single database.
>
> https://www.sqlite.org/src/artifact/0b3173f69b8899da

Cool, indeed. I also had a look at the CSV file extension:
https://www.sqlite.org/src/artifact?udc=1&ln=on&name=65297bcce8d5acd5

Someone has actually come up with an extension to read Parquet files:
https://cldellow.com/2018/06/22/sqlite-parquet-vtable.html

(...taking a deep breath...)

Alright, I'm just gonna say it. How {hard, stupid, useful} do you guys think it would be to write an SQLite extension to add directory-based partitioning on top of the CSV extension and let the OS and filesystem take care of the rest?

Something like "day=2018-10-29\source=source_a\bucket1.csv". I've heard people call it "hive-style" partitioning.

As long as the size of each individual file remains reasonable, I might as well be happy with CSV files and just read the whole file sequentially.

Alternatively, the same partitioning approach might be added to unionvtab, whichever feels simpler. I have no idea how the partitioning semantics could be specified, though.

I know, that kinda brings us back to my original question at the end of July about database sharding:
https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg111250.html

Perhaps I'm just too biased towards this approach.

Thank you so much for your patience, guys!
Gerlando
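
For illustration only, here is a rough Python sketch of what "hive-style" partition handling could look like at the path level. It is not an SQLite extension and does not touch the CSV or unionvtab code; the function names parse_partitions and prune are made up for this example. It only shows how key=value directory names could be turned into column values, and how files could be skipped when a query filters on those columns.

```python
from pathlib import PureWindowsPath


def parse_partitions(path):
    """Split a hive-style path into (partition dict, file name).

    PureWindowsPath is used because it accepts both '\\' and '/'
    as separators, matching the example path in the message.
    """
    parts = PureWindowsPath(path).parts
    keys = {}
    for segment in parts[:-1]:
        key, sep, value = segment.partition("=")
        if sep:  # skip directories that are not of the form key=value
            keys[key] = value
    return keys, parts[-1]


def prune(paths, **wanted):
    """Keep only the files whose partition values match every filter."""
    selected = []
    for path in paths:
        keys, _ = parse_partitions(path)
        if all(keys.get(k) == v for k, v in wanted.items()):
            selected.append(path)
    return selected


if __name__ == "__main__":
    # Hypothetical file listing; only the first path appears in the message.
    files = [
        r"day=2018-10-29\source=source_a\bucket1.csv",
        r"day=2018-10-29\source=source_b\bucket1.csv",
        r"day=2018-10-30\source=source_a\bucket1.csv",
    ]
    print(parse_partitions(files[0]))
    print(prune(files, day="2018-10-29"))
```

In a real virtual table the pruning step would presumably be driven by the constraints SQLite hands to the module, so that only the matching CSV files are opened and read sequentially; that part is left out here.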