Re: jackson-core-asl jar (1.8.8 vs 1.9.x) conflict with the spark-sql (version 1.x)

2014-06-28 Thread Paul Brown
Hi, Mans -- Both of those versions of Jackson are pretty ancient. Do you know which of the Spark dependencies is pulling them in? It would be good for us (the Jackson, Woodstox, etc., folks) to see if we can get people to upgrade to more recent versions of Jackson. -- Paul

Re: Distribute data from Kafka evenly on cluster

2014-06-28 Thread Mayur Rustagi
how about this? https://groups.google.com/forum/#!topic/spark-users/ntPQUZFJt4M Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi https://twitter.com/mayur_rustagi On Sat, Jun 28, 2014 at 10:19 AM, Tobias Pfeiffer t...@preferred.jp wrote: Hi, I have a

collect on partitions gets very slow near the last few partitions.

2014-06-28 Thread Sung Hwan Chung
I'm doing something like this: rdd.groupBy.map().collect() The workload on the final map is pretty much evenly distributed. When collect happens, say on 60 partitions, the first 55 or so partitions finish very quickly, say within 10 seconds. However, the last 5, particularly the very last one,
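A straggling final partition often points to key skew after the groupBy. A minimal sketch for checking this (assumes a SparkContext `sc` and an input `rdd`; `keyFn` is a placeholder for whatever grouping key the job uses):

```scala
// Sketch: count records per partition after the groupBy to spot skew.
// `rdd` and `keyFn` are assumptions standing in for the poster's job.
val partitionSizes = rdd
  .groupBy(keyFn)
  .mapPartitionsWithIndex { (idx, iter) => Iterator((idx, iter.size)) }
  .collect()

// The largest partitions are the likely stragglers during collect().
partitionSizes.sortBy(-_._2).take(5).foreach { case (idx, n) =>
  println(s"partition $idx holds $n groups")
}
```

If one partition holds far more groups than the rest, a different partitioner or a salted key may spread the load more evenly.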

Re: collect on partitions gets very slow near the last few partitions.

2014-06-28 Thread Sung Hwan Chung
I'm finding the following messages in the driver. Can this potentially have anything to do with these drastic slowdowns? 14/06/28 00:00:17 INFO ShuffleBlockManager: Could not find files for shuffle 8 for deleting 14/06/28 00:00:17 INFO ContextCleaner: Cleaned shuffle 8 14/06/28 00:00:17 INFO

Re: HBase 0.96+ with Spark 1.0+

2014-06-28 Thread Sean Owen
This sounds like an instance of roughly the same item as in https://issues.apache.org/jira/browse/SPARK-1949 Have a look at adding that exclude to see if it works. On Fri, Jun 27, 2014 at 10:21 PM, Stephen Boesch java...@gmail.com wrote: The present trunk is built and tested against HBase 0.94.

Re: HBase 0.96+ with Spark 1.0+

2014-06-28 Thread Stephen Boesch
Thanks Sean. I had actually already added an exclusion rule for org.mortbay.jetty - and that had not resolved it. Just in case I used your precise formulation: val excludeMortbayJetty = ExclusionRule(organization = "org.mortbay.jetty") .. ,("org.apache.spark" % "spark-core_2.10" % sparkVersion
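For readers following the thread, a minimal build.sbt sketch of the exclusion approach under discussion (the exact versions and the HBase coordinates are assumptions for illustration, not from the thread):

```scala
// build.sbt sketch: keep old Mortbay Jetty artifacts off the classpath,
// per the SPARK-1949 discussion. sparkVersion and the HBase version are
// placeholders chosen for illustration.
val sparkVersion = "1.0.0"

val excludeMortbayJetty = ExclusionRule(organization = "org.mortbay.jetty")

libraryDependencies ++= Seq(
  ("org.apache.spark" % "spark-core_2.10" % sparkVersion)
    .excludeAll(excludeMortbayJetty),
  "org.apache.hbase" % "hbase-client" % "0.96.2-hadoop2"
)
```

As the follow-up below notes, the exclusion alone did not resolve the issue for Stephen, so treat this as a starting point rather than a confirmed fix.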

Re: HBase 0.96+ with Spark 1.0+

2014-06-28 Thread Siyuan he
Hi Stephen, I am using Spark 1.0 + HBase 0.96.2. This is what I did: 1) rebuild spark using: mvn -Dhadoop.version=2.3.0 -Dprotobuf.version=2.5.0 -DskipTests clean package 2) In spark-env.sh, set SPARK_CLASSPATH = /path-to/hbase-protocol-0.96.2-hadoop2.jar Hopefully it can help. Siyuan On Sat, Jun
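The two steps above, laid out as a sketch (the jar path is the poster's placeholder and stays a placeholder here):

```shell
# Step 1: rebuild Spark against Hadoop 2.3.0 with protobuf 2.5.0,
# matching HBase 0.96's protobuf requirement.
mvn -Dhadoop.version=2.3.0 -Dprotobuf.version=2.5.0 -DskipTests clean package

# Step 2: in conf/spark-env.sh, put the HBase protocol jar on the classpath
# so its protobuf-generated classes are found first.
export SPARK_CLASSPATH=/path-to/hbase-protocol-0.96.2-hadoop2.jar
```

SPARK_CLASSPATH was the Spark 1.0-era mechanism for prepending jars; later releases moved to spark.driver.extraClassPath and spark.executor.extraClassPath.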

Re: jackson-core-asl jar (1.8.8 vs 1.9.x) conflict with the spark-sql (version 1.x)

2014-06-28 Thread M Singh
Hi Paul: Here are the dependencies in spark 1.1.0-SNAPSHOT that are pulling in org.codehaus.jackson:jackson-core-asl 1.8 and 1.9 jars. 1.9: com.twitter:parquet-hadoop:jar:1.4.3, org.apache.avro:avro:jar:1.7.6. 1.8: org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT
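For anyone hitting the same conflict in an sbt build, one common workaround is to pin a single jackson-core-asl version so both transitive paths resolve to it. A minimal sketch (the pinned version 1.9.13 is an assumption, not from the thread; verify it against your dependency tree):

```scala
// build.sbt sketch: force one version of the legacy Codehaus Jackson
// artifact across all transitive dependencies. 1.9.13 is an assumed
// choice here; pick whatever version your own dependency tree needs.
dependencyOverrides += "org.codehaus.jackson" % "jackson-core-asl" % "1.9.13"
```

Maven users can achieve the same effect by declaring the artifact in dependencyManagement, which similarly overrides transitive version choices.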

Alternative to checkpointing and materialization for truncating lineage in high iteration jobs

2014-06-28 Thread Nilesh Chakraborty
Hello, In a thread about java.lang.StackOverflowError when calling count() [1] I saw Tathagata Das share an interesting approach for truncating RDD lineage - this helps prevent StackOverflowErrors in high iteration jobs while avoiding the disk-writing performance penalty. Here's an excerpt from

Re: Alternative to checkpointing and materialization for truncating lineage in high iteration jobs

2014-06-28 Thread Baoxu Shi(Dash)
I’m facing the same situation. It would be great if someone could provide a code snippet as example. On Jun 28, 2014, at 12:36 PM, Nilesh Chakraborty nil...@nileshc.com wrote: Hello, In a thread about java.lang.StackOverflowError when calling count() [1] I saw Tathagata Das share an
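Since the thread's excerpt is truncated before the actual technique, here is a sketch of the standard checkpoint-based pattern that such jobs start from, which the thread is looking to improve on because of its disk-write cost. Everything below (`sc`, `initialRdd`, `step`, the interval of 10) is a placeholder, and this is not necessarily the exact approach Tathagata Das described:

```scala
// Generic lineage-truncation pattern for a high-iteration job.
// Without periodic truncation, the lineage graph grows with every
// iteration and can eventually cause a StackOverflowError.
sc.setCheckpointDir("/tmp/spark-checkpoints")  // assumed directory

var current = initialRdd
for (i <- 1 to numIterations) {
  current = step(current).persist()
  if (i % 10 == 0) {        // truncation interval is illustrative
    current.checkpoint()    // cut the lineage at this RDD
    current.count()         // force materialization so the checkpoint runs
  }
}
```

The count() is needed because checkpoint() is lazy: the data is only written, and the lineage only truncated, when an action materializes the RDD.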