> page should tell you ;-)
>
> On 10. Oct 2017, at 17:53, Kanagha Kumar wrote:
>
> Thanks for the inputs!!
>
> I passed in spark.mapred.max.split.size and spark.mapred.min.split.size set
> to the size I wanted to read, but it didn't take any effect.
> I also tried passing in spark.
ended).
>>
>> > On 10. Oct 2017, at 09:14, Kanagha Kumar
>> wrote:
>> >
>> > Hi,
>> >
>> > I'm trying to read a 60GB HDFS file using spark
>> textFile("hdfs_file_path", minPartitions).
>> >
>> > How can
Hi,
I'm trying to read a 60GB HDFS file using spark textFile("hdfs_file_path",
minPartitions).
How can I control the number of tasks by increasing the split size? With the
default split size of 250 MB, several tasks are created. But I would like
to have a specific number of tasks created while reading from H
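In case it helps a future reader of this thread: the mapred.* split settings are Hadoop properties, not Spark ones, so setting them as plain spark.* conf keys has no effect. A likely fix (a sketch, untested here; the jar name is a placeholder) is to pass the Hadoop 2 property names through with the spark.hadoop. prefix:

```shell
# Sketch: force larger input splits (and hence fewer tasks) when reading
# a large file with sc.textFile(). The spark.hadoop.* prefix copies the
# property into the underlying Hadoop Configuration; values are in bytes
# (here 1 GB). my_job.jar is a placeholder.
spark-submit \
  --conf spark.hadoop.mapreduce.input.fileinputformat.split.minsize=1073741824 \
  --conf spark.hadoop.mapreduce.input.fileinputformat.split.maxsize=1073741824 \
  my_job.jar
```

Note that the minPartitions argument of textFile() can only increase the partition count beyond the default, never reduce it; to get fewer tasks you need larger splits (as above) or a coalesce() after reading.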
Hi,
I'm trying to read data from HDFS in Spark as DataFrames. Printing the
schema, I see all columns are being read as strings. I'm converting it to
RDDs and creating another DataFrame by passing in the correct schema (i.e.,
how the rows should finally be interpreted).
I'm getting the following error:
yan guha wrote:
> How about using row number for primary key?
>
> Select row_number() over (), * from table
>
> On Fri, 29 Sep 2017 at 10:21 am, Kanagha Kumar
> wrote:
>
>> Hi,
>>
>> I'm trying to replicate a single row from a dataset n times and creat
in Java and pass it to
the explode function? Any suggestions are helpful.
Thanks
Kanagha
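A sketch of the two suggestions from this thread combined, assuming the Spark 2.x Java API (it needs a Spark runtime on the classpath, so it is not runnable standalone; class and column names are illustrative):

```java
// Sketch only; names are illustrative. Replicates each row n times by
// exploding an n-element array column, then adds a synthetic primary key
// with row_number() as yan guha suggested.
import static org.apache.spark.sql.functions.*;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;

public class ReplicateRows {
    public static Dataset<Row> replicate(Dataset<Row> ds, int n) {
        // Build array(lit(0), ..., lit(n-1)) and explode it: one output
        // row per array element, i.e. n copies of every input row.
        Column[] seq = new Column[n];
        for (int i = 0; i < n; i++) {
            seq[i] = lit(i);
        }
        Dataset<Row> copies = ds.withColumn("copy", explode(array(seq)));
        // row_number() over a window gives each copy a unique id that can
        // serve as a primary key; then drop the helper column.
        return copies
            .withColumn("pk", row_number().over(Window.orderBy("copy")))
            .drop("copy");
    }
}
```

The window here has no partitioning, so row numbers are assigned in a single task; for large outputs a monotonically_increasing_id() column scales better, at the cost of non-contiguous keys.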
Hi,
I am using Spark 2.0.2. I'm not sure what is causing this error to
occur. Any inputs would be really helpful; I appreciate any help with
this.
Exception caught: Job aborted due to stage failure: Task 0 in stage
0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
(TID 3, ...
Hi,
I'm trying to run a Phoenix Spark job in cluster mode against a remote
YARN cluster.
When I do a spark-submit, all jars under SPARK_HOME get uploaded.
I also need to point to the remote HBase jar folder location and the other
dependencies needed to run the job.
Going through the docs, I see setti
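For the record, one common way to handle both points (a sketch; every path, class name, and jar below is a placeholder):

```shell
# Sketch: submit in YARN cluster mode and ship the extra jars explicitly.
# --jars distributes the listed jars to the driver and executors;
# spark.yarn.jars points YARN at a pre-staged copy of the Spark jars on
# HDFS, so the contents of SPARK_HOME do not have to be uploaded on
# every submit. All paths are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars /path/to/phoenix-client.jar,/path/to/hbase-common.jar \
  --conf spark.yarn.jars=hdfs:///spark/jars/*.jar \
  --class com.example.MyPhoenixJob \
  my_job.jar
```

spark.yarn.archive (a single zip of the jars on HDFS) is an alternative to spark.yarn.jars with the same effect.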
Hi all,
Bumping it again! Please let me know if anyone has faced this in 2.0.x
versions. I am using spark 2.0.2 for runtime. Based on the comments, I will
open a bug if necessary. Thanks!
On Thu, Jul 6, 2017 at 4:00 PM, Kanagha Kumar
wrote:
> Hi,
>
> I'm running spark 2.0.2 v
Hi,
I'm running spark 2.0.2 version and I'm noticing an issue with
DataFrameWriter.save()
Code:
ds.write().format("jdbc").mode("overwrite").options(ImmutableMap.of(
"driver", "org.apache.phoenix.jdbc.PhoenixDriver",
"url", urlWithTenant,
"dbtabl
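Not a fix for the save() issue itself, but for comparison: the phoenix-spark connector has its own data source, which upserts into an existing table instead of going through the generic JDBC path. A sketch (needs the connector on the classpath; table name and ZooKeeper URL are placeholders):

```java
// Sketch, assuming the phoenix-spark connector jar is on the classpath.
// Unlike mode("overwrite") through the generic JDBC writer (which drops
// and recreates the table), the Phoenix data source upserts rows into an
// already-created table. "MY_TABLE" and the zkUrl are placeholders.
ds.write()
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .option("table", "MY_TABLE")
  .option("zkUrl", "zkhost:2181")
  .save();
```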
Hi,
I am *intermittently* seeing this error while doing spark-submit with the
Spark 2.0.2 / Scala 2.11 build.
I see the same issue reported in
https://issues.apache.org/jira/browse/SPARK-18343 and it seems to be
RESOLVED. I can run successfully most of the time though. Hence I'm unsure
if it is becau
}
org.scala-lang
*common/sketch/pom.xml*:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-tags_${scala.binary.version}</artifactId>
</dependency>
On Mon, Jun 19, 2017 at 2:25 PM, Kanagha Kumar
wrote:
> Thanks. But, I am required to do a maven release to Nexus on spark 2.0.2
> built against scal
(such as spark-tags) are Java projects. Spark doesn't
> fix the artifact name and just hard-codes 2.11.
>
> For your issue, try to use `install` rather than `package`.
>
> On Sat, Jun 17, 2017 at 7:20 PM, Kanagha Kumar
> wrote:
>
>> Hi,
>>
>> Bumping u
Hi,
Bumping up again! Why do the Spark modules depend on Scala 2.11 versions
in spite of changing the pom.xmls using ./dev/change-scala-version.sh 2.10?
Appreciate any quick help!!
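For anyone hitting the same thing: the sequence the Spark 2.x build docs describe, combined with the earlier reply's install-not-package advice, is roughly:

```shell
# Sketch, per the Spark 2.x build docs: first rewrite the poms for
# Scala 2.10, then build with the matching -Dscala-2.10 property so the
# profile-activated modules agree with the rewritten poms. Use `install`
# (not just `package`) so the locally built artifacts land in the local
# Maven repo and downstream modules resolve against them instead of the
# published 2.11 artifacts.
./dev/change-scala-version.sh 2.10
./build/mvn -Dscala-2.10 -DskipTests clean install
```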
Thanks
On Fri, Jun 16, 2017 at 2:59 PM, Kanagha Kumar
wrote:
> Hey all,
>
>
> I'm trying to use Spark
.
Thanks
Kanagha