Which version of Spark are you running? Have you created a Kafka DirectStream?
I am asking because you might or might not be using receivers.
Also, when you say it hangs, do you mean there is no further log output and the
process is still up?
Or do you mean it kept on adding jobs but did nothing else? (I am
@TD: Doesn't transformWith need both of the DStreams to have the same
slideDuration?
[Spark Version: 1.3.1]
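For context, a minimal sketch of the transformWith call in question. The two pair DStreams are assumed to be built elsewhere on the same StreamingContext; whether they may have different slideDurations is exactly what is being asked.

    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.dstream.DStream

    // Sketch only: combine two DStreams batch-by-batch via transformWith.
    def joined(
        words: DStream[(String, Int)],
        counts: DStream[(String, Long)]): DStream[(String, (Int, Long))] =
      words.transformWith(counts,
        (a: RDD[(String, Int)], b: RDD[(String, Long)]) => a.join(b))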
Hi.
The number of partitions is determined by the RDD used in the plan Spark SQL
creates. It uses NewHadoopRDD, which derives its partitions from getSplits of
the input format in use; here that is FilteringParquetRowInputFormat, a
subclass of ParquetInputFormat. To change the number of partitions, write a new
input format and
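This is not the thread author's code, but a hedged sketch of what "write a new input format" could look like: subclass the input format and override getSplits so that NewHadoopRDD sees a different number of splits.

    import java.util.{List => JList}
    import org.apache.hadoop.mapreduce.{InputSplit, JobContext}
    import parquet.hadoop.ParquetInputFormat

    // Sketch: control the partition count by controlling the splits.
    // The actual merge/subdivide logic is omitted; package names match
    // the Spark 1.x / parquet-mr era discussed here.
    class CustomSplitParquetInputFormat[T] extends ParquetInputFormat[T] {
      override def getSplits(job: JobContext): JList[InputSplit] = {
        val splits = super.getSplits(job)
        // e.g. coalesce or subdivide `splits` to hit a target count
        splits
      }
    }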
Hi,
How did you check the number of splits in your file? Did you run your MR job
or calculate it?
The formula for split size is
max(minSize, min(maxSize, blockSize)). Can you check whether it explains your
case?
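To make the formula concrete, a quick worked check with typical defaults (the values here are illustrative, not the poster's configuration):

    // splitSize = max(minSize, min(maxSize, blockSize))
    val blockSize = 128L * 1024 * 1024 // dfs.blocksize, e.g. 128 MB
    val minSize   = 1L                 // mapreduce.input.fileinputformat.split.minsize
    val maxSize   = Long.MaxValue      // mapreduce.input.fileinputformat.split.maxsize
    val splitSize = math.max(minSize, math.min(maxSize, blockSize))
    // splitSize == 128 MB, so a 1 GB file yields 8 splits, hence 8 partitions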
Thanks and Regards,
Archit Thakur.
On Saturday, April 25, 2015, Wenlei Xie wenlei@gmail.com wrote:
Hi,
This should work. How are you checking the number of partitions?
Thanks and Regards,
Archit Thakur.
On Mon, Apr 20, 2015 at 7:26 PM, mas mas.ha...@gmail.com wrote:
Hi,
I aim to do custom partitioning on a text file. I first convert it into a
pair RDD and then try to use my custom
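The poster's partitioner is cut off above; as a hedged sketch, a custom Partitioner for a pair RDD generally looks like the following (the input path and key choice are assumptions):

    import org.apache.spark.Partitioner

    // Minimal custom partitioner: route keys to partitions by hash.
    class MyPartitioner(override val numPartitions: Int) extends Partitioner {
      override def getPartition(key: Any): Int = {
        val raw = key.hashCode % numPartitions
        if (raw < 0) raw + numPartitions else raw // keep the index non-negative
      }
    }

    // Usage on a text file turned into a pair RDD, as described above:
    val pairs = sc.textFile("hdfs:///path/to/input.txt") // hypothetical path
      .map(line => (line.split(",")(0), line))
    val custom = pairs.partitionBy(new MyPartitioner(8))
    println(custom.partitions.length) // should print 8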
Hi,
It should generate the same number of partitions as the number of splits.
How did you check the number of partitions? Also, please paste your file size
and your hdfs-site.xml and mapred-site.xml here.
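For reference, a minimal way to check the partition count from spark-shell (the path is hypothetical):

    val rdd = sc.textFile("hdfs:///path/to/file.txt")
    println(rdd.partitions.length) // one partition per input split by default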
Thanks and Regards,
Archit Thakur.
On Sat, Apr 18, 2015 at 6:20 PM, Wenlei Xie wenlei@gmail.com wrote:
Hi
There are a lot of similar problems shared and resolved by users on this same
portal, and I have been part of those discussions before. Please search for
them, try the suggested solutions, and let us know if you still face problems.
Thanks and Regards,
Archit Thakur.
On Mon, Apr 20, 2015 at 3:05 PM, madhvi madhvi.gu
The same difference as between map and foreach: mapPartitions takes an
iterator and returns an iterator, while foreachPartition takes an iterator and
returns Unit.
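A minimal, self-contained sketch of that difference (a local-mode example, not code from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    object PartitionOpsDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("demo").setMaster("local[2]"))
        val rdd = sc.parallelize(Seq("a", "bb", "ccc"), 2)

        // mapPartitions: a transformation, Iterator[String] => Iterator[Int];
        // lazily returns a new RDD.
        val lengths = rdd.mapPartitions(iter => iter.map(_.length))

        // foreachPartition: an action, Iterator[Int] => Unit;
        // runs eagerly, purely for side effects, and returns nothing.
        lengths.foreachPartition(iter => iter.foreach(println))

        sc.stop()
      }
    }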
On Mon, Apr 20, 2015 at 4:05 PM, Arun Patel arunp.bigd...@gmail.com wrote:
What is difference between mapPartitions vs foreachPartition?
When to use these?
Thanks,
Arun
True.
On Mon, Apr 20, 2015 at 4:14 PM, Arun Patel arunp.bigd...@gmail.com wrote:
mapPartitions is a transformation and foreachPartition is an action?
Thanks
Arun
On Mon, Apr 20, 2015 at 4:38 AM, Archit Thakur archit279tha...@gmail.com
wrote:
The same difference as between map and foreach
all information present in the executor tab is for running executors.
Thanks,
Archit Thakur.
On Mon, Apr 20, 2015 at 1:31 PM, twinkle sachdeva
twinkle.sachd...@gmail.com wrote:
Hi Archit,
What is your use case and what kind of metrics are you planning to add?
Thanks,
Twinkle
On Fri, Apr 17
Hi Praveen,
Can you try once removing the throw exception in map? Do you still not get it?
On Apr 18, 2015 8:14 AM, Praveen Balaji secondorderpolynom...@gmail.com
wrote:
Thanks for the response, Imran. I probably chose the wrong methods for
this email. I implemented all methods of SparkListener
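The poster's listener is not shown; as a hedged sketch, implementing SparkListener callbacks and registering the listener looks roughly like this (only the two callbacks shown are overridden):

    import org.apache.spark.scheduler.{SparkListener,
      SparkListenerStageCompleted, SparkListenerTaskEnd}

    // Override only the callbacks of interest; the rest stay no-ops.
    class LoggingListener extends SparkListener {
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit =
        println(s"task ${taskEnd.taskInfo.taskId} ended: ${taskEnd.reason}")

      override def onStageCompleted(stage: SparkListenerStageCompleted): Unit =
        println(s"stage ${stage.stageInfo.stageId} completed")
    }

    // Register on an existing SparkContext (sc assumed):
    sc.addSparkListener(new LoggingListener)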
Yes, you can. Use the partitionBy method and pass a partitioner to it.
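A hedged sketch of that suggestion (the RDD names and the partition count are assumptions): co-partitioning both sides with one HashPartitioner lets an ensuing join find equal keys on the same node.

    import org.apache.spark.HashPartitioner

    val partitioner = new HashPartitioner(8)      // partition count is arbitrary here
    val left  = leftRdd.partitionBy(partitioner)  // leftRdd, rightRdd: pair RDDs
    val right = rightRdd.partitionBy(partitioner) // assumed to exist already
    val joined = left.join(right) // same partitioner on both sides avoids a re-shuffle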
On Apr 17, 2015 8:18 PM, Jeetendra Gangele gangele...@gmail.com wrote:
OK, is there a way I can use hash partitioning so that I can improve the
performance?
On 17 April 2015 at 19:33, Archit Thakur archit279tha...@gmail.com
Hi,
We are planning to add new metrics in Spark for the executors that got killed
during execution. I was just curious why this info is not already present.
Is there some reason for not adding it?
Any ideas around are welcome.
Thanks and Regards,
Archit Thakur.
map phase of join*
On Fri, Apr 17, 2015 at 5:28 PM, Archit Thakur archit279tha...@gmail.com
wrote:
Ajay,
This is true. When we call join again on two RDDs, rather than computing the
whole pipeline again, it reads the map output of the map phase of an RDD
(which it usually gets from the shuffle
shuffling?
Also, I am running with very few records currently, and it is still shuffling?
regards
jeetendra
On 17 April 2015 at 15:58, Archit Thakur archit279tha...@gmail.com
wrote:
I don't think you can change it to 4 bytes without any custom compilation.
To make the same key go to the same node, you'll have
) //might not be needed again.
Will both of our cases be satisfied: that it uses existingRDDTableName from
the cache for the union, and doesn't duplicate the data in the cache but
somehow appends to the older cacheTable?
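A hedged sketch of the scenario being asked about (Spark 1.x SQLContext API; the table names come from the question, the rest is assumed):

    // existingRDDTableName is already registered and cached:
    sqlContext.cacheTable("existingRDDTableName")

    // Union the new batch with the cached table; the open question above is
    // whether this reuses the cached blocks or duplicates them.
    newSchemaRdd.registerTempTable("newRDDTableName") // hypothetical new batch
    val combined = sqlContext.sql(
      "SELECT * FROM existingRDDTableName UNION ALL SELECT * FROM newRDDTableName")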
Thanks and Regards,
Archit Thakur.
Sr Software Developer,
Guavus, Inc.
On Sat, Sep 13, 2014
)
but the request is stuck indefinitely.
This works when I set
sparkConf.setMaster("yarn-client")
I am not sure why it is not launching the job in yarn-cluster mode.
Any thoughts?
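For reference, a minimal sketch of the programmatic setup being described (Spark 1.x era; the assembly-jar path is a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("servlet-launched-job")
      .setMaster("yarn-client") // works; "yarn-cluster" is what hangs here
      .set("spark.yarn.jar", "hdfs:///path/to/spark-assembly.jar") // placeholder
    val sc = new SparkContext(conf)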
Thanks and Regards,
Archit Thakur.
including user@spark.apache.org.
On Fri, Aug 29, 2014 at 2:03 PM, Archit Thakur archit279tha...@gmail.com
wrote:
Hi,
My requirement is to run Spark on YARN without using the spark-submit script.
I have a servlet and a Tomcat server. As and when a request comes, it creates
a new SparkContext
Hi,
I am facing the same problem.
Did you find any solution or workaround?
Thanks and Regards,
Archit Thakur.
On Thu, Jan 16, 2014 at 6:22 AM, Liu, Raymond raymond@intel.com wrote:
Hi
Regarding your question:
1) when I run the above script, which jar is being submitted to the YARN
Hi,
I want to manage logging of containers when I run Spark through YARN. I
checked that there is an environment variable exposed for a custom
log4j.properties. Setting SPARK_LOG4J_CONF to /dir/log4j.properties should
ideally make the containers use the
/dir/log4j.properties file for logging. This doesn't seem
Hi Joe,
Your messages are going into the spam folder for me.
Thx, Archit_Thakur.
On Fri, May 2, 2014 at 9:22 AM, Joe L selme...@yahoo.com wrote:
Hi, you should include the jar file of your project, for example:
conf.setJars(Seq("yourjarfilepath.jar"))
Joe
On Friday, May 2, 2014 7:39 AM, proofmoore