Unsubscribe

2018-02-11 Thread Archit Thakur
Unsubscribe

Re: Spark streaming job hangs

2015-12-01 Thread Archit Thakur
Which version of Spark are you running? Have you created a Kafka DirectStream? I am asking because you might or might not be using receivers. Also, when you say it hangs, do you mean there is no further log output and the process is still up? Or do you mean it kept adding jobs but did nothing else? (I am
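
For reference, a minimal sketch of the receiver-less direct stream the reply asks about, assuming Spark 1.3+ with the spark-streaming-kafka artifact on the classpath; the broker address and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("DirectStreamExample")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Direct (receiver-less) stream: each Kafka partition maps to an RDD partition.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // placeholder topic

    stream.map(_._2).count().print() // per-batch counts appear in the log even when "nothing happens"
    ssc.start()
    ssc.awaitTermination()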

Re: using multiple dstreams together (spark streaming)

2015-09-28 Thread Archit Thakur
@TD: Doesn't transformWith need both of the DStreams to have the same slideDuration? [Spark Version: 1.3.1] -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/using-multiple-dstreams-together-spark-streaming-tp9947p24839.html Sent from the Apache Spark User List
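
A minimal sketch of the constraint being asked about, assuming Spark 1.3: both DStreams below come from the same StreamingContext and batch interval, so their slideDuration matches and transformWith can pair their RDDs batch for batch (hosts and ports are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("TransformWith"), Seconds(5))
    val a = ssc.socketTextStream("host1", 9999).map(line => (line, 1))
    val b = ssc.socketTextStream("host2", 9999).map(line => (line, 1))

    // transformWith aligns one RDD from each stream per batch; mismatched
    // slide durations would make that pairing ill-defined.
    val joined = a.transformWith(b,
      (rddA: RDD[(String, Int)], rddB: RDD[(String, Int)]) => rddA.join(rddB))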

Re: Parquet number of partitions

2015-05-07 Thread Archit Thakur
Hi. The number of partitions is determined by the RDD used in the plan it creates. It uses NewHadoopRDD, which derives its partitions from the getSplits of the input format it is using; here that is FilteringParquetRowInputFormat, a subclass of ParquetInputFormat. To change the number of partitions, write a new input format and
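
A rough illustration of the mechanism described above, for a spark-shell session (sc and sqlContext predefined); the path is a placeholder, and whether the split-size setting takes effect depends on the input format honoring the standard FileInputFormat knobs:

    // Partition count of the scan: one partition per split from getSplits.
    val df = sqlContext.parquetFile("/data/table.parquet") // placeholder path
    println(df.rdd.partitions.length)

    // One lever that avoids writing a custom input format: raise the minimum
    // split size so getSplits returns fewer, larger splits.
    sc.hadoopConfiguration.setLong(
      "mapreduce.input.fileinputformat.split.minsize", 256L * 1024 * 1024)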

Re: Number of input partitions in SparkContext.sequenceFile

2015-05-02 Thread Archit Thakur
Hi, how did you check the number of splits in your file? Did you run your MR job or calculate it? The formula for split size is max(minSize, min(maxSize, blockSize)). Can you check whether it explains your case? Thanks and Regards, Archit Thakur. On Saturday, April 25, 2015, Wenlei Xie wenlei@gmail.com wrote
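
A worked instance of that formula with common defaults (values illustrative):

    // split size = max(minSize, min(maxSize, blockSize))
    val minSize   = 1L                 // mapreduce.input.fileinputformat.split.minsize
    val maxSize   = Long.MaxValue      // mapreduce.input.fileinputformat.split.maxsize
    val blockSize = 128L * 1024 * 1024 // HDFS block size, assumed 128 MB here

    val splitSize = math.max(minSize, math.min(maxSize, blockSize)) // = 128 MB
    // A 1 GB file then yields ceil(1024 MB / 128 MB) = 8 splits, hence 8 partitions.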

Re: Custom Partitioning Spark

2015-04-21 Thread Archit Thakur
Hi, this should work. How are you checking the number of partitions? Thanks and Regards, Archit Thakur. On Mon, Apr 20, 2015 at 7:26 PM, mas mas.ha...@gmail.com wrote: Hi, I aim to do custom partitioning on a text file. I first convert it into a pairRDD and then try to use my custom
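
A minimal sketch of the pattern under discussion, with a made-up partitioner and a placeholder input path; partitions.length on the result is the reliable way to check the count:

    import org.apache.spark.Partitioner

    // Illustrative custom partitioner: route keys by their first character.
    class FirstLetterPartitioner(parts: Int) extends Partitioner {
      override def numPartitions: Int = parts
      override def getPartition(key: Any): Int =
        math.abs(key.toString.headOption.getOrElse('_').toInt) % parts
    }

    val pairs = sc.textFile("/data/input.txt") // placeholder path
      .map(line => (line.split(",")(0), line))
    val partitioned = pairs.partitionBy(new FirstLetterPartitioner(8))
    println(partitioned.partitions.length) // should print 8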

Re: Number of input partitions in SparkContext.sequenceFile

2015-04-21 Thread Archit Thakur
Hi, it should generate the same number of partitions as the number of splits. How did you check the number of partitions? Also, please paste your file size and your hdfs-site.xml and mapred-site.xml here. Thanks and Regards, Archit Thakur. On Sat, Apr 18, 2015 at 6:20 PM, Wenlei Xie wenlei@gmail.com wrote: Hi

Re: Running spark over HDFS

2015-04-20 Thread Archit Thakur
There are a lot of similar problems shared and resolved by users on this same portal, and I have been part of those discussions before. Search for those, please try them, and let us know if you still face problems. Thanks and Regards, Archit Thakur. On Mon, Apr 20, 2015 at 3:05 PM, madhvi madhvi.gu

Re: mapPartitions vs foreachPartition

2015-04-20 Thread Archit Thakur
The same difference as between map and foreach: mapPartitions takes an iterator and returns an iterator, while foreachPartition takes an iterator and returns Unit. On Mon, Apr 20, 2015 at 4:05 PM, Arun Patel arunp.bigd...@gmail.com wrote: What is the difference between mapPartitions and foreachPartition? When to use these? Thanks, Arun
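
A short sketch contrasting the two, assuming a plain spark-shell session:

    val rdd = sc.parallelize(1 to 100, 4)

    // mapPartitions: a transformation, Iterator => Iterator, yields a new RDD.
    val sums = rdd.mapPartitions(it => Iterator(it.sum))
    sums.collect() // Array with the sum of each of the 4 partitions

    // foreachPartition: an action, Iterator => Unit, returns nothing; typical
    // for side effects like writing each partition to an external store.
    rdd.foreachPartition(it => println(s"partition size: ${it.size}"))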

Re: mapPartitions vs foreachPartition

2015-04-20 Thread Archit Thakur
True. On Mon, Apr 20, 2015 at 4:14 PM, Arun Patel arunp.bigd...@gmail.com wrote: mapPartitions is a transformation and foreachPartition is an action? Thanks Arun On Mon, Apr 20, 2015 at 4:38 AM, Archit Thakur archit279tha...@gmail.com wrote: The same difference as between map and foreach

Re: Addition of new Metrics for killed executors.

2015-04-20 Thread Archit Thakur
all information present in the executor tab for running executors. Thanks, Archit Thakur. On Mon, Apr 20, 2015 at 1:31 PM, twinkle sachdeva twinkle.sachd...@gmail.com wrote: Hi Archit, What is your use case and what kind of metrics are you planning to add? Thanks, Twinkle On Fri, Apr 17

Re: Can't get SparkListener to work

2015-04-18 Thread Archit Thakur
Hi Praveen, can you try once after removing the thrown exception in map? Do you still not get it? On Apr 18, 2015 8:14 AM, Praveen Balaji secondorderpolynom...@gmail.com wrote: Thanks for the response, Imran. I probably chose the wrong methods for this email. I implemented all methods of SparkListener
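
A minimal listener sketch of the kind being debugged, assuming Spark 1.x in a spark-shell session; only onTaskEnd is overridden here:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

    sc.addSparkListener(new SparkListener {
      // Fires once per completed task, whether it succeeded or failed.
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit =
        println(s"task ended: stage ${taskEnd.stageId}, reason ${taskEnd.reason}")
    })
    sc.parallelize(1 to 10).count() // should trigger the callback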

Re: Custom partioner

2015-04-18 Thread Archit Thakur
Yes, you can. Use the partitionBy method and pass the partitioner to it. On Apr 17, 2015 8:18 PM, Jeetendra Gangele gangele...@gmail.com wrote: OK, is there a way I can use hash partitioning so that I can improve the performance? On 17 April 2015 at 19:33, Archit Thakur archit279tha...@gmail.com
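
A minimal sketch of that suggestion; the data is made up:

    import org.apache.spark.HashPartitioner

    val records = sc.parallelize(Seq((1, "a"), (2, "b"), (1, "c")))
    // Hash partitioning routes equal keys to the same partition (and node),
    // so later key-based operations can avoid a full shuffle.
    val hashed = records.partitionBy(new HashPartitioner(8)).persist()

Persisting matters here: without it the partitioning is recomputed for every downstream action instead of being reused.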

Addition of new Metrics for killed executors.

2015-04-17 Thread Archit Thakur
Hi, we are planning to add new metrics in Spark for executors that got killed during execution. I was just curious why this info is not already present. Is there some reason for not adding it? Any ideas are welcome. Thanks and Regards, Archit Thakur.

Re: Joined RDD

2015-04-17 Thread Archit Thakur
map phase of the join* On Fri, Apr 17, 2015 at 5:28 PM, Archit Thakur archit279tha...@gmail.com wrote: Ajay, this is true. When we call join again on two RDDs, rather than computing the whole pipeline again, it reads the map output of the map phase of an RDD (which it usually gets from shuffle
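
A small illustration of the reuse being described, assuming the shuffle files from the first action are still on disk; the data is made up:

    val rdd1 = sc.parallelize(Seq((1, "a"), (2, "b")))
    val rdd2 = sc.parallelize(Seq((1, "x"), (2, "y")))

    val joined = rdd1.join(rdd2) // wide dependency: a shuffle boundary
    joined.count() // runs the map phase and writes shuffle output
    joined.count() // skips the map-side stages and re-reads that shuffle output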

Re: Custom partioner

2015-04-17 Thread Archit Thakur
shuffling? Also, I am running with very few records currently; still it is shuffling? Regards, Jeetendra On 17 April 2015 at 15:58, Archit Thakur archit279tha...@gmail.com wrote: I don't think you can change it to 4 bytes without any custom compilation. To make the same key go to the same node, you'll have

Re: Use Case of mutable RDD - any ideas around will help.

2014-09-12 Thread Archit Thakur
) //might not be needed again. Will both our cases be satisfied, i.e., that it uses existingRDDTableName from the cache for the union and does not duplicate the data in the cache but somehow appends to the older cacheTable? Thanks and Regards, Archit Thakur. Sr. Software Developer, Guavus, Inc. On Sat, Sep 13, 2014

Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
) but the request is stuck indefinitely. This works when I set sparkConf.setMaster("yarn-client"). I am not sure why it is not launching the job in yarn-cluster mode. Any thoughts? Thanks and Regards, Archit Thakur.
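
A sketch of the embedded-driver setup that does work, per the message above, with a placeholder jar path: in yarn-client mode the driver stays in the servlet's own JVM, whereas yarn-cluster mode needs the driver itself shipped to the cluster, which simply constructing a SparkContext cannot do:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("EmbeddedDriver")
      .setMaster("yarn-client")         // works from inside the web container
      .setJars(Seq("/path/to/app.jar")) // placeholder: application code for the executors
    val sc = new SparkContext(conf)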

Re: Running Spark On Yarn without Spark-Submit

2014-08-29 Thread Archit Thakur
Including user@spark.apache.org. On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur archit279tha...@gmail.com wrote: Hi, my requirement is to run Spark on YARN without using the spark-submit script. I have a servlet and a Tomcat server. As and when a request comes, it creates a new SC

Re: Anyone know hot to submit spark job to yarn in java code?

2014-08-29 Thread Archit Thakur
Hi, I am facing the same problem. Did you find any solution or workaround? Thanks and Regards, Archit Thakur. On Thu, Jan 16, 2014 at 6:22 AM, Liu, Raymond raymond@intel.com wrote: Hi, regarding your question 1) when I run the above script, which jar is being submitted to the yarn

Logging in Spark through YARN.

2014-07-30 Thread Archit Thakur
Hi, I want to manage the logging of containers when I run Spark through YARN. I checked that there is an environment variable exposed for a custom log4j.properties. Setting SPARK_LOG4J_CONF to /dir/log4j.properties should ideally make the containers use the /dir/log4j.properties file for logging. This doesn't seem
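
For reference, a minimal log4j.properties of the kind SPARK_LOG4J_CONF is meant to point at (contents illustrative); note that the containers can only use the file if it is actually shipped to them, which is a common failure point:

    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n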

Re: java.lang.ClassNotFoundException

2014-05-12 Thread Archit Thakur
Hi Joe, your messages are going into the spam folder for me. Thx, Archit_Thakur. On Fri, May 2, 2014 at 9:22 AM, Joe L selme...@yahoo.com wrote: Hi, you should include the jar file of your project, for example: conf.set(yourjarfilepath.jar) Joe On Friday, May 2, 2014 7:39 AM, proofmoore
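
The quoted conf.set(yourjarfilepath.jar) is not a valid call as written; a sketch of the usual ways to ship an application jar, with a placeholder path:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp")
      .setJars(Seq("/path/to/yourproject.jar")) // placeholder; distributed to executors
    val sc = new SparkContext(conf)

    // or, after the context exists:
    sc.addJar("/path/to/yourproject.jar")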