quet files directly. Spark has
partition-awareness for partitioned directories.
Still, I would like to know: is there a way to leverage
partition awareness via Hive when using the `spark.sql` API?
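For illustration, here is a minimal plain-Scala sketch (no Spark dependency) of the idea behind partition pruning in a Hive-style directory layout: partition values are encoded in path segments like `date=2016-08-01`, so a predicate on a partition column can skip whole directories before any file is read. The object name, paths, and column names below are hypothetical.

```scala
object PartitionPruning {
  // Parse "col=value" partition components out of a path.
  // Assumption: each partition segment contains exactly one "=" separator.
  def partitionValues(path: String): Map[String, String] =
    path.split('/').collect {
      case seg if seg.contains("=") =>
        val Array(k, v) = seg.split("=", 2)
        k -> v
    }.toMap

  // Keep only the files whose partition values satisfy the predicate;
  // non-matching directories are never touched.
  def prune(paths: Seq[String], predicate: Map[String, String] => Boolean): Seq[String] =
    paths.filter(p => predicate(partitionValues(p)))
}
```

For example, pruning with a predicate on `date` keeps only the files under the matching partition directory:

```scala
PartitionPruning.prune(
  Seq("s3://bucket/table/date=2016-08-01/part-0.parquet",
      "s3://bucket/table/date=2016-08-02/part-0.parquet"),
  pv => pv.get("date").contains("2016-08-01"))
```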
Any help is highly appreciated!
Thank you.
--
Hao Ren
Yes, it is.
You can define a UDF like that.
Basically, it's a UDF of type Int => Int whose closure contains a
non-serializable object.
The latter should cause a "Task not serializable" exception.
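The failure mode can be reproduced without Spark at all, since Spark ships task closures with plain Java serialization. Below is a small sketch: a closure that captures a non-serializable object fails to serialize, while one that captures nothing succeeds. The class and helper names are hypothetical.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for any non-serializable dependency
// (a connection, a client, etc.). Note: it does NOT extend Serializable.
class NonSerializableThing {
  val offset: Int = 1
}

object ClosureCheck {
  // Java-serialize an object, the same mechanism Spark uses to ship
  // task closures to executors; return the failure message on error.
  def trySerialize(obj: AnyRef): Either[String, Int] =
    try {
      val bytes = new ByteArrayOutputStream()
      new ObjectOutputStream(bytes).writeObject(obj)
      Right(bytes.size)
    } catch {
      case e: NotSerializableException => Left("not serializable: " + e.getMessage)
    }
}
```

A closure capturing a `NonSerializableThing` fails the check, whereas a pure `Int => Int` passes:

```scala
val thing = new NonSerializableThing
ClosureCheck.trySerialize((x: Int) => x + thing.offset) // Left: `thing` is captured
ClosureCheck.trySerialize((x: Int) => x + 1)            // Right: nothing captured
```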
Hao
On Mon, Aug 8, 2016 at 5:08 AM, Muthu Jayakumar <bablo...@gmail.com> wrote:
> H
($"key" === 2).show() // It does not work as expected:
// org.apache.spark.SparkException: Task not serializable
}
run()
}
Also, I tried collect(), count(), first(), and limit(). All of them worked
without serialization exceptions.
It seems that only filter() throws the exception?
--
Hao Ren
Data Engineer @ leboncoin
Paris, France
ache/spark/sql/catalyst/expressions/complexTypeExtractors.scala#L49
It seems that the pattern matching does not take UDTs into consideration.
Is this intended behavior? If not, I would like to open a PR to fix it.
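Conceptually, the issue resembles a match over a type hierarchy that omits one subtype, so that subtype falls through to a failure branch. The sketch below is plain Scala with hypothetical names, not Spark's actual `DataType` hierarchy; the fix shown (unwrapping the UDT and recursing on its underlying type) is only an illustration of the shape such a patch could take.

```scala
// Hypothetical miniature of a SQL type hierarchy.
sealed trait DataType
case object IntType extends DataType
case object StructType extends DataType
// A user-defined type wraps an underlying SQL type.
case class UserDefinedType(underlying: DataType) extends DataType

def extractorFor(dt: DataType): String = dt match {
  case IntType    => "int extractor"
  case StructType => "struct extractor"
  // Without this case, a UDT would fail with "Can't extract value from ...".
  // The fix: unwrap the UDT and dispatch on its underlying type.
  case UserDefinedType(u) => extractorFor(u)
}
```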
--
Hao Ren
Data Engineer @ leboncoin
Paris, France
ll/7099/files#diff-668c79317c51f40df870d3404d8a731fR272>);
> perhaps you could push for this to happen by creating a Jira and pinging
> jkbradley and mengxr. Thanks!
>
> On Thu, Sep 17, 2015 at 8:07 AM, Hao Ren <inv...@gmail.com> wrote:
>
>> Working on spark.ml.classification.LogisticRegression.s
(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
--
Hao Ren
Data Engineer @ leboncoin
Paris, France
is highly appreciated.
If you need more info, check out the JIRA I created:
https://issues.apache.org/jira/browse/SPARK-8869
On Thu, Jul 16, 2015 at 11:39 AM, Hao Ren <inv...@gmail.com> wrote:
Given the following code, which simply reads from S3 and then saves files back to S3:
val