Re: Continue reading dataframe from file despite errors

2017-09-12 Thread Suresh Thalamati
Try the CSV option("mode", "DROPMALFORMED"); that might skip the error records. > On Sep 12, 2017, at 2:33 PM, jeff saremi wrote: > > should have added some of the exception to be clear: > > 17/09/12 14:14:17 ERROR TaskSetManager: Task 0 in stage 15.0 failed 1
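A minimal sketch of the suggestion, assuming a CSV read with a header row (the input path is hypothetical):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("csv-skip-malformed").getOrCreate()

    // DROPMALFORMED silently drops rows that do not parse against the schema,
    // instead of failing the task.
    val df = spark.read
      .option("header", "true")
      .option("mode", "DROPMALFORMED")   // other modes: PERMISSIVE (default), FAILFAST
      .csv("/path/to/input.csv")         // hypothetical path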

Re: Dataset count on database or parquet

2017-02-09 Thread Suresh Thalamati
If you have to get the data into parquet format for other reasons anyway, then I think count() on the parquet should be better. If it is just the count you need, sending dbtable = (select count(*) from ) to the database might be quicker; it will avoid unnecessary data transfer from the database to
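A minimal sketch of pushing the count down via the dbtable option, assuming a JDBC source (URL, credentials, and table name are hypothetical; the subquery alias is required by most databases):

    // The database computes the count; only one row travels over the wire.
    val countDF = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/mydb")           // hypothetical URL
      .option("dbtable", "(select count(*) as cnt from my_table) t") // hypothetical table
      .option("user", "dbuser")
      .option("password", "dbpass")
      .load()

    val total = countDF.first().getLong(0)  // result type may vary by database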

Re: Dataframe fails to save to MySQL table in spark app, but succeeds in spark shell

2017-01-26 Thread Suresh Thalamati
I notice columns are quoted with double quotes in the error message ('"user","age","state")). By chance did you override the MySQL JDBC dialect? By default, the MySQL dialect quotes identifiers with backticks: override def quoteIdentifier(colName: String): String = { s"`$colName`" } Just wondering if the error
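For reference, a minimal sketch of registering a dialect that quotes identifiers with backticks (the object name is made up; canHandle decides which JDBC URLs the dialect applies to):

    import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

    object MyMySQLDialect extends JdbcDialect {
      override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")
      // Quote identifiers with backticks, as MySQL expects.
      override def quoteIdentifier(colName: String): String = s"`$colName`"
    }

    JdbcDialects.registerDialect(MyMySQLDialect)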

Re: Deep learning libraries for scala

2016-09-30 Thread Suresh Thalamati
TensorFrames: https://spark-packages.org/package/databricks/tensorframes Hope that helps. -suresh > On Sep 30, 2016, at 8:00 PM, janardhan shetty wrote: > > Looking for scala dataframes in particular ? > >

Re: Spark_JDBC_Partitions

2016-09-13 Thread Suresh Thalamati
There is also another jdbc method in the DataFrameReader API to specify your own predicates for each partition. Using this you can control what is included in each partition. val jdbcPartitionWhereClause = Array[String]("id < 100", "id >= 100 and id < 200") val df = spark.read.jdbc(
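A sketch of the complete call, assuming hypothetical URL, table, and credentials; each predicate string becomes the WHERE clause of one partition's query:

    import java.util.Properties

    val jdbcPartitionWhereClause = Array[String]("id < 100", "id >= 100 and id < 200")

    val connProps = new Properties()
    connProps.setProperty("user", "dbuser")        // hypothetical credentials
    connProps.setProperty("password", "dbpass")

    // Two predicates -> two partitions, read in parallel.
    val df = spark.read.jdbc(
      "jdbc:postgresql://dbhost:5432/mydb",        // hypothetical URL
      "my_table",                                  // hypothetical table
      jdbcPartitionWhereClause,
      connProps)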

Re: JDBC SQL Server RDD

2016-05-17 Thread Suresh Thalamati
What is the error you are getting? At least on the main code line I see JDBCRDD is marked as private[sql]. A simple alternative might be to call SQL Server using the DataFrame API, and get an RDD from the DataFrame. e.g.: val df =
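A minimal sketch of that alternative, assuming the Microsoft JDBC driver is on the classpath (URL, table, and credentials are hypothetical):

    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=mydb") // hypothetical URL
      .option("dbtable", "my_table")                                   // hypothetical table
      .option("user", "dbuser")
      .option("password", "dbpass")
      .load()

    // Drop down from the DataFrame to an RDD[Row] when an RDD is required.
    val rdd = df.rdd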

Re: Microsoft SQL dialect issues

2016-03-15 Thread Suresh Thalamati
You should be able to register your own dialect if the default mappings are not working for your scenario. import org.apache.spark.sql.jdbc._ JdbcDialects.registerDialect(MyDialect) Please refer to JdbcDialects to find an example of the existing default dialect for your database or another
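A sketch of what such a dialect can look like; the type mappings below are illustrative examples, not a complete SQL Server dialect:

    import java.sql.Types
    import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
    import org.apache.spark.sql.types._

    object MySqlServerDialect extends JdbcDialect {
      override def canHandle(url: String): Boolean = url.startsWith("jdbc:sqlserver")

      // Database type -> Catalyst type: e.g. surface DATETIMEOFFSET as a string.
      override def getCatalystType(sqlType: Int, typeName: String, size: Int,
          md: MetadataBuilder): Option[DataType] =
        if (typeName == "datetimeoffset") Some(StringType) else None

      // Catalyst type -> database type: e.g. write StringType as NVARCHAR(MAX).
      override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
        case StringType => Some(JdbcType("NVARCHAR(MAX)", Types.NVARCHAR))
        case _          => None
      }
    }

    JdbcDialects.registerDialect(MySqlServerDialect)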

Re: Error reading a CSV

2016-02-24 Thread Suresh Thalamati
Try creating the /user/hive/warehouse/ directory if it does not exist, and check that it has write permission for the user. Note the lower case 'user' in the path. > On Feb 24, 2016, at 2:42 PM, skunkwerk wrote: > > I have downloaded the Spark binary with Hadoop 2.6. > When

[no subject]

2016-01-08 Thread Suresh Thalamati

subscribe

2016-01-04 Thread Suresh Thalamati