Additional missing relevant information:
I'm running a transformation, there are no shuffles occurring, and at the
end I'm performing a lookup of 4 partitions on the driver.
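For illustration, a minimal sketch of this kind of driver-side lookup (the data and names are made up, not the actual job):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._

val sc = new SparkContext(new SparkConf().setAppName("lookup-sketch").setMaster("local[2]"))
// lookup() returns every value for the key as a Seq on the driver; when the
// RDD has a known partitioner, only the owning partition is scanned.
val pairs = sc.parallelize(Seq((1, "a"), (2, "b"), (1, "c"))) // stand-in data
val valuesForKey: Seq[String] = pairs.lookup(1)               // Seq("a", "c")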
On 10/7/15 11:26 AM, Yadid Ayzenberg wrote:
Hi All,
I'm using Spark 1.4.1 to analyze a largish data set (several …
… should tweak in order to improve the performance?
Or perhaps provide an explanation as to the behavior I'm witnessing?
Yadid
    …
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
On 9/3/15 2:25 PM, Yadid Ayzenberg wrote:
Hi Akhil,
No, it seems I …
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:745)
Do you think that is related to the problem?
Yadid
On 8/28/15 1:31 AM, Akhil Das wrote:
Is it filling up your disk space? Can you look a bit more in the
executor logs to see what's …?
Yadid
Job aborted due to stage failure: Task 27 in stage 286.0 failed 4 times, most
recent failure: Lost task 27.3 in stage 286.0 (TID 516817, xx.yy.zz.ww):
com.esotericsoftware.kryo.KryoException: com.ning.compress.lzf.LZFException:
Corrupt input data, block did not start with 2 byte …
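One mitigation sometimes suggested when LZF-compressed, Kryo-serialized blocks come back corrupt is to switch Spark's built-in compression codec; whether that applies here is only an assumption. A rough sketch:

import org.apache.spark.{SparkConf, SparkContext}

// Hedged sketch: the trace shows LZF decompressing Kryo-serialized data;
// "snappy" and "lz4" are the other built-in codecs besides "lzf".
val conf = new SparkConf()
  .setAppName("codec-sketch")
  .set("spark.io.compression.codec", "snappy")
val sc = new SparkContext(conf)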
… comes to mind is to pass the data to an external
Python process via pipe, but this seems cumbersome. Can someone provide
a better alternative?
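For reference, the pipe approach mentioned above looks roughly like this (a hedged sketch; the script path is made up):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("pipe-sketch").setMaster("local[2]"))
val data = sc.parallelize(Seq("a", "b", "c"))
// pipe() writes each element to the external process's stdin (one per line)
// and turns the process's stdout lines into an RDD[String].
val piped = data.pipe("./process.py") // hypothetical script
piped.collect().foreach(println)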
Yadid
Thanks Michael, that makes total sense.
It works perfectly.
Yadid
On Thu, Aug 28, 2014 at 9:19 PM, Michael Armbrust wrote:
> The comma is just the way the default toString works for Row objects.
> Since SchemaRDDs are also RDDs, you can do arbitrary transformations on
> the Row objects.
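Concretely, such a transformation might look like the following sketch (the table name, delimiter, and output path are illustrative):

import org.apache.spark.sql.SQLContext

// A SchemaRDD is an RDD[Row], so re-join the fields with any delimiter
// instead of relying on Row's comma-separated toString.
val sqlContext = new SQLContext(sc) // sc: an existing SparkContext
val results = sqlContext.sql("SELECT * FROM people")
val tabSeparated = results.map(_.mkString("\t")) // Rows are Seq-like
tabSeparated.saveAsTextFile("people_tsv")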
Hi All,
Is there any way to change the delimiter from being a comma?
Some of the strings in my data contain commas as well, making it very
difficult to parse the results.
Yadid
The worker logs and executor logs do not contain errors. Any ideas what
the problem is?
Yadid
… te expression. type: UnresolvedAttribute, tree: 'param1
I guess I must be missing a method in the implementation. Any pointers
appreciated.
Yadid
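For context, a rough sketch of the hand-rolled Product pattern this thread is about (field names are invented; per the thread, Spark SQL's reflection may still fail to resolve attributes on such a class, which a case class avoids):

// Hedged sketch of implementing scala.Product by hand.
class Record(val param1: String, val param2: Int)
    extends Product with Serializable {
  override def productArity: Int = 2
  override def productElement(n: Int): Any = n match {
    case 0 => param1
    case 1 => param2
    case _ => throw new IndexOutOfBoundsException(n.toString)
  }
  override def canEqual(that: Any): Boolean = that.isInstanceOf[Record]
}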
Yep, I just issued a pull request.
Yadid
On 5/31/14, 1:25 PM, Patrick Wendell wrote:
1. ctx is an instance of JavaSQLContext, but the textFile method is called as
a member of ctx.
According to the API, JavaSQLContext does not have such a member, so I'm
guessing this should be sc instead.
Yeah
2. In that same code example the object sqlCtx is referenced, but it is
never instantiated in the code.
Should this be ctx?
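For concreteness, a sketch of what the corrected example presumably intends, in Scala (the Java API mirrors this with JavaSparkContext and JavaSQLContext):

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sc = new SparkContext("local", "docs-example")
val sqlCtx = new SQLContext(sc)       // point 2: instantiate the SQL context explicitly
val lines = sc.textFile("people.txt") // point 1: textFile is a SparkContext method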
Cheers,
Yadid
An additional option, 4): use SparkContext.addJar() and have the
application ship your jar to all the nodes.
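Roughly, as a sketch (the jar path and version are illustrative):

// sc is the application's SparkContext; addJar ships the jar to the
// cluster so tasks of this SparkContext can load its classes.
sc.addJar("/path/to/breeze_2.10-0.7.jar")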
Yadid
On 5/4/14, 4:07 PM, DB Tsai wrote:
If you add the breeze dependency in your build.sbt project, it will
not be available to all the workers.
There are a couple of options: 1) use sbt …
Dear Sparkers,
Has anyone got any insight on this? I am really stuck.
Yadid
On 4/28/14, 11:28 AM, Yadid Ayzenberg wrote:
Thanks for your answer.
I tried running on a single machine - master and worker on one host. I
get exactly the same results.
Very little CPU activity on the machine in …
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
The RDD is created using newAPIHadoopRDD.
Any additional info I can provide?
Yadid
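For illustration, a sketch of building an RDD with newAPIHadoopRDD; the thread does not say which InputFormat was used, so TextInputFormat stands in here:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// sc is the application's SparkContext; the input path is illustrative.
val hadoopConf = new Configuration()
hadoopConf.set("mapreduce.input.fileinputformat.inputdir", "hdfs:///data")
val records = sc.newAPIHadoopRDD(
  hadoopConf,
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text])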
On 4/28/14 10:46 AM, Daniel Darabos wrote:
That is quite mysterious, and I do not think we have enough
information to answer. JavaPairRDD.lookup() works fine
on a remote Spark cluster:
$ MASTER=spark…
Some additional information - maybe this rings a bell with someone:
I suspect this happens when the lookup returns more than one value.
For 0 and 1 values, the function behaves as you would expect.
Anyone?
On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote:
Hi All,
I'm running a lookup on a …
… debug this problem?
Thanks,
Yadid