> page should tell you ;-)
>
> On 10. Oct 2017, at 17:53, Kanagha Kumar wrote:
>
> Thanks for the inputs!!
>
> I passed in spark.mapred.max.split.size and spark.mapred.min.split.size set
> to the size I wanted to read, but it didn't take any effect.
> I also tried passing in spark.
ended).
>>
>> > On 10. Oct 2017, at 09:14, Kanagha Kumar
>> wrote:
>> >
>> > Hi,
>> >
>> > I'm trying to read a 60GB HDFS file using spark
>> textFile("hdfs_file_path", minPartitions).
>> >
>> > How can
Hi,
I'm trying to read a 60GB HDFS file using spark textFile("hdfs_file_path",
minPartitions).
How can I control the number of tasks by increasing the split size? With the
default split size of 250 MB, several tasks are created. But I would like
to have a specific number of tasks created while reading from H
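In case it helps a future reader of this thread: the mapred.* split settings are Hadoop properties, not Spark ones, so setting them as plain spark.* conf keys has no effect. A likely fix (a sketch, untested here; the jar name is a placeholder) is to pass the Hadoop 2 property names through with the spark.hadoop. prefix:

```shell
# Sketch: force larger input splits (and hence fewer tasks) when reading
# a large file with sc.textFile(). The spark.hadoop.* prefix copies the
# property into the underlying Hadoop Configuration; values are in bytes
# (here 1 GB). my_job.jar is a placeholder.
spark-submit \
  --conf spark.hadoop.mapreduce.input.fileinputformat.split.minsize=1073741824 \
  --conf spark.hadoop.mapreduce.input.fileinputformat.split.maxsize=1073741824 \
  my_job.jar
```

Note that the minPartitions argument of textFile() can only increase the partition count beyond the default, never reduce it; to get fewer tasks you need larger splits (as above) or a coalesce() after reading.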
Hi,
I'm trying to read data from HDFS in Spark as DataFrames. Printing the
schema, I see all columns are being read as strings. I'm converting it to
RDDs and creating another DataFrame by passing in the correct schema (i.e.,
how the rows should finally be interpreted).
I'm getting the following error:
yan guha wrote:
> How about using row number for primary key?
>
> Select row_number() over (), * from table
>
> On Fri, 29 Sep 2017 at 10:21 am, Kanagha Kumar
> wrote:
>
>> Hi,
>>
>> I'm trying to replicate a single row from a dataset n times and creat
in Java and pass it to
the explode function? Any suggestions are helpful.
Thanks
Kanagha
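A sketch of the two suggestions from this thread combined, assuming the Spark 2.x Java API (it needs a Spark runtime on the classpath, so it is not runnable standalone; class and column names are illustrative):

```java
// Sketch only; names are illustrative. Replicates each row n times by
// exploding an n-element array column, then adds a synthetic primary key
// with row_number() as yan guha suggested.
import static org.apache.spark.sql.functions.*;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;

public class ReplicateRows {
    public static Dataset<Row> replicate(Dataset<Row> ds, int n) {
        // Build array(lit(0), ..., lit(n-1)) and explode it: one output
        // row per array element, i.e. n copies of every input row.
        Column[] seq = new Column[n];
        for (int i = 0; i < n; i++) {
            seq[i] = lit(i);
        }
        Dataset<Row> copies = ds.withColumn("copy", explode(array(seq)));
        // row_number() over a window gives each copy a unique id that can
        // serve as a primary key; then drop the helper column.
        return copies
            .withColumn("pk", row_number().over(Window.orderBy("copy")))
            .drop("copy");
    }
}
```

The window here has no partitioning, so row numbers are assigned in a single task; for large outputs a monotonically_increasing_id() column scales better, at the cost of non-contiguous keys.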
Hi,
I am using Spark 2.0.2. I'm not sure what is causing this error to
occur. Any inputs would be really helpful; I appreciate any help with
this.
Exception caught: Job aborted due to stage failure: Task 0 in stage
0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
(TID 3, ...
Hi,
I'm trying to run a Phoenix Spark job in cluster mode against a remote
YARN cluster.
When I do a spark-submit, all jars under SPARK_HOME get uploaded.
I also need to point to the remote HBase jar folder location and the other
dependencies needed to run the job.
Going through the docs, I see setti
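For the record, one common way to handle both points (a sketch; every path, class name, and jar below is a placeholder):

```shell
# Sketch: submit in YARN cluster mode and ship the extra jars explicitly.
# --jars distributes the listed jars to the driver and executors;
# spark.yarn.jars points YARN at a pre-staged copy of the Spark jars on
# HDFS, so the contents of SPARK_HOME do not have to be uploaded on
# every submit. All paths are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars /path/to/phoenix-client.jar,/path/to/hbase-common.jar \
  --conf spark.yarn.jars=hdfs:///spark/jars/*.jar \
  --class com.example.MyPhoenixJob \
  my_job.jar
```

spark.yarn.archive (a single zip of the jars on HDFS) is an alternative to spark.yarn.jars with the same effect.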
Hi all,
Bumping it again! Please let me know if anyone has faced this in 2.0.x
versions. I am using spark 2.0.2 for runtime. Based on the comments, I will
open a bug if necessary. Thanks!
On Thu, Jul 6, 2017 at 4:00 PM, Kanagha Kumar
wrote:
> Hi,
>
> I'm running spark 2.0.2 v
Hi,
I'm running spark 2.0.2 version and I'm noticing an issue with
DataFrameWriter.save()
Code:
ds.write().format("jdbc").mode("overwrite").options(ImmutableMap.of(
"driver", "org.apache.phoenix.jdbc.PhoenixDriver",
"url", urlWithTenant,
"dbtabl
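Not a fix for the save() issue itself, but for comparison: the phoenix-spark connector has its own data source, which upserts into an existing table instead of going through the generic JDBC path. A sketch (needs the connector on the classpath; table name and ZooKeeper URL are placeholders):

```java
// Sketch, assuming the phoenix-spark connector jar is on the classpath.
// Unlike mode("overwrite") through the generic JDBC writer (which drops
// and recreates the table), the Phoenix data source upserts rows into an
// already-created table. "MY_TABLE" and the zkUrl are placeholders.
ds.write()
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .option("table", "MY_TABLE")
  .option("zkUrl", "zkhost:2181")
  .save();
```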
Hi,
I am *intermittently* seeing this error while doing spark-submit with the
Spark 2.0.2 / Scala 2.11 build.
I see the same issue reported in
https://issues.apache.org/jira/browse/SPARK-18343 and it seems to be
RESOLVED. I can run successfully most of the time though. Hence I'm unsure
if it is becau
}
org.scala-lang
*common/sketch/pom.xml*:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-tags_${scala.binary.version}</artifactId>
</dependency>
On Mon, Jun 19, 2017 at 2:25 PM, Kanagha Kumar
wrote:
> Thanks. But, I am required to do a maven release to Nexus on spark 2.0.2
> built against scal
(such as spark-tags) are Java projects. Spark doesn't
> fix the artifact name and just hard-codes 2.11.
>
> For your issue, try to use `install` rather than `package`.
>
> On Sat, Jun 17, 2017 at 7:20 PM, Kanagha Kumar
> wrote:
>
>> Hi,
>>
>> Bumping u
Hi,
Bumping up again! Why do the Spark modules depend on Scala 2.11 versions
in spite of changing the pom.xmls using ./dev/change-scala-version.sh 2.10?
Appreciate any quick help!!
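For anyone hitting the same thing: the sequence the Spark 2.x build docs describe, combined with the earlier reply's install-not-package advice, is roughly:

```shell
# Sketch, per the Spark 2.x build docs: first rewrite the poms for
# Scala 2.10, then build with the matching -Dscala-2.10 property so the
# profile-activated modules agree with the rewritten poms. Use `install`
# (not just `package`) so the locally built artifacts land in the local
# Maven repo and downstream modules resolve against them instead of the
# published 2.11 artifacts.
./dev/change-scala-version.sh 2.10
./build/mvn -Dscala-2.10 -DskipTests clean install
```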
Thanks
On Fri, Jun 16, 2017 at 2:59 PM, Kanagha Kumar
wrote:
> Hey all,
>
>
> I'm trying to use Spark
.
Thanks
Kanagha