Hi,
I guess this is not a CSV-datasource-specific problem.
Does loading any other file (e.g. via textFile()) work as well?
I think this is related to this thread:
http://apache-spark-user-list.1001560.n3.nabble.com/Error-while-running-example-scala-application-using-spark-submit-td10056.html
Hi All,
I have an RDD containing data in the following form:
tempRDD: RDD[(String, (String, String))]
(brand, (product, key))
("amazon",("book1","tech"))
("eBay",("book1","tech"))
("barns",("book","tech"))
("amazon",("book2","tech"))
I would like to group the data by brand and would
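The message is cut off, but the question appears to be how to group these (brand, (product, key)) pairs by brand. As a sketch of what Spark's groupByKey() would produce on tempRDD, here is the same grouping done with plain Python collections (no Spark needed; the data is taken from the example above):

```python
from collections import defaultdict

# Example data from the message above: (brand, (product, key)) pairs.
pairs = [
    ("amazon", ("book1", "tech")),
    ("eBay", ("book1", "tech")),
    ("barns", ("book", "tech")),
    ("amazon", ("book2", "tech")),
]

# Collect the (product, key) values per brand, mirroring the shape
# that tempRDD.groupByKey() returns: (brand, iterable-of-values).
grouped = defaultdict(list)
for brand, value in pairs:
    grouped[brand].append(value)

print(dict(grouped))
```

In Spark itself this corresponds to tempRDD.groupByKey(); if the follow-up computation only needs an aggregate per brand (a count, a set of products, etc.), reduceByKey or aggregateByKey is usually preferable, since it avoids shuffling every value to one place.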
Hi,
So yeah, I know that Spark jobs running on a Hadoop cluster will inherit their
security from the underlying YARN job.
However… that’s not really saying much when you think about some use cases.
Like using the Thrift service …
I’m wondering what else is new and what people have been
This is great feedback to hear. I think there was discussion about moving
Pipelines outside of ML at some point, but I'll have to spend more time to
dig it up.
In the meantime, I thought I'd mention this JIRA here in case people have
feedback:
https://issues.apache.org/jira/browse/SPARK-14033
+1.
Tom
On Tuesday, March 29, 2016 1:17 PM, Reynold Xin wrote:
They work.
On Tue, Mar 29, 2016 at 10:01 AM, Koert Kuipers wrote:
> if Scala prior to 2.10.4 didn't support Java 8, does that mean that
> 3rd party Scala libraries compiled with a Scala version < 2.10.4 might not
> work on Java 8?
>
>
> On Mon, Mar 28, 2016 at 7:06 PM,
Hi All,
I have written a Spark program on my dev box:
IDE: IntelliJ
Scala version: 2.11.7
Spark version: 1.6.1
It runs fine from the IDE when I provide proper input and output paths, including
the master.
But when I try to deploy the code on my cluster, made up of the below,
Spark
if Scala prior to 2.10.4 didn't support Java 8, does that mean that 3rd
party Scala libraries compiled with a Scala version < 2.10.4 might not work
on Java 8?
On Mon, Mar 28, 2016 at 7:06 PM, Kostas Sakellis
wrote:
> Also, +1 on dropping jdk7 in Spark 2.0.
>
> Kostas
>
Hi, I'm interested in figuring out how the Python API for Spark works.
I've come to the following conclusions and want to share them with the
community; they could be of use in the PySpark docs, specifically the
"Execution and pipelining" part.
Any sanity checking would be much appreciated.
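For what it's worth, the pipelining in question, where PySpark evaluates a chain of narrow transformations such as map and filter lazily in a single pass over each partition, can be illustrated with plain Python generators. This is a sketch of the idea, not actual PySpark internals:

```python
# Sketch of pipelining with generators: nothing runs until list()
# pulls elements through, and each element then flows through both
# steps in one pass, analogous to how PySpark fuses narrow
# transformations (map, filter) into a single pipelined task.
partition = [1, 2, 3, 4, 5]

mapped = (x * 10 for x in partition)       # like rdd.map(lambda x: x * 10)
filtered = (x for x in mapped if x > 20)   # like rdd.filter(lambda x: x > 20)

result = list(filtered)  # evaluation happens here, one pass over the data
print(result)
```

In PySpark the fused steps of such a chain run together inside one Python worker per task, rather than materializing an intermediate dataset after each transformation.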
While Sonatype is utterly strict about the org.apache namespace (it guarantees
that all such artifacts have come through the ASF release process, ideally
including code signing), nobody checks the org.apache internals, or worries too
much about them. Note that Spark itself has some bits of
Hi,
I have a web service that provides a REST API to train the random forest algorithm.
I train the random forest on a 5-node Spark cluster with enough memory -
everything is cached (~22 GB).
On small datasets of up to 100k samples everything is fine, but with the
biggest one (400k samples and ~70k features)