Think about how you want to distribute your data and how your keys are
currently spread. Do you want to compute something per day, per week, etc.?
Based on that, return a partition number. You could use mod 30 or a similar
function to derive the partition.
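
For illustration, here is a minimal (untested) sketch, assuming your keys
are epoch timestamps stored as Long; the class name ModPartitioner is just
a placeholder:

import org.apache.spark.Partitioner

// Routes each Long key to (key mod numPartitions); falls back to the
// key's hashCode for any other key type.
class ModPartitioner(partitions: Int) extends Partitioner {
  require(partitions > 0, "partitions must be positive")

  override def numPartitions: Int = partitions

  override def getPartition(key: Any): Int = key match {
    case ts: Long =>
      // Keep the result non-negative even for negative keys.
      (((ts % partitions) + partitions) % partitions).toInt
    case other =>
      ((other.hashCode % partitions) + partitions) % partitions
  }
}

If you need exactly one record per partition (as asked below), raw
timestamps may collide or skew, so you could key on a unique index first:

val indexed = rdd.zipWithIndex().map { case (kv, idx) => (idx, kv) }
val even = indexed.partitionBy(new ModPartitioner(30)).values

With 30 records, zipWithIndex assigns indices 0 through 29, so idx mod 30
lands exactly one record in each partition.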
On Nov 18, 2015 5:17 AM, "prateek arora" <prateek.arora...@gmail.com> wrote:

> Hi
> I am trying to implement a custom partitioner using this link:
> http://stackoverflow.com/questions/23127329/how-to-define-custom-partitioner-for-spark-rdds-of-equally-sized-partition-where
> (in the linked example, the keys run from 0 to (noOfElement - 1)),
>
> but I am not able to figure out how to implement a custom partitioner in
> my case:
>
> my parent RDD has 4 partitions, and the RDD key is a TimeStamp while the
> value is a JPEG byte array.
>
>
> Regards
> Prateek
>
>
> On Tue, Nov 17, 2015 at 9:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Please take a look at the following files, for example:
>>
>> ./core/src/main/scala/org/apache/spark/api/python/PythonPartitioner.scala
>> ./core/src/main/scala/org/apache/spark/Partitioner.scala
>>
>> Cheers
>>
>> On Tue, Nov 17, 2015 at 9:24 AM, prateek arora <
>> prateek.arora...@gmail.com> wrote:
>>
>>> Hi
>>> Thanks
>>> I am new to Spark development, so could you provide some help with
>>> writing a custom partitioner to achieve this?
>>> If you have a link or an example for writing a custom partitioner,
>>> please share it with me.
>>>
>>> On Mon, Nov 16, 2015 at 6:13 PM, Sabarish Sasidharan <
>>> sabarish.sasidha...@manthan.com> wrote:
>>>
>>>> You can write your own custom partitioner to achieve this.
>>>>
>>>> Regards
>>>> Sab
>>>> On 17-Nov-2015 1:11 am, "prateek arora" <prateek.arora...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I have an RDD with 30 records (key/value pairs) and am running 30
>>>>> executors. I want to repartition this RDD into 30 partitions so that
>>>>> every partition gets one record and is assigned to one executor.
>>>>>
>>>>> When I use rdd.repartition(30), it repartitions my RDD into 30
>>>>> partitions, but some partitions get 2 records, some get 1 record, and
>>>>> some do not get any records.
>>>>>
>>>>> Is there any way in Spark to distribute my records evenly across all
>>>>> partitions?
>>>>>
>>>>> Regards
>>>>> Prateek
>>>>>
>>>
>>
>
