Re: Issues with large schema tables

2018-03-08 Thread Gourav Sengupta
Hi Ballas, in Data Science terms you have 4500 variables that are uncorrelated or independent of each other. In Data Modelling terms you have an entity with 4500 properties. I have worked on hair-splitting financial products, and even they do not have properties of a financial product with

Issues with large schema tables

2018-03-07 Thread Ballas, Ryan W
Hello All, Our team is having a lot of issues with the Spark API, particularly with large schema tables. We currently have a program written in Scala that uses the Apache Spark API to create two tables from raw files. We have one particularly large raw data file that contains around