Unsubscribe

2018-06-22 Thread Uzi Hadad



From: Thiago 
Sent: Thursday, June 14, 2018 9:28:14 PM
To: dev@spark.apache.org
Subject: Unsubscribe


Unsubscribe


Re: RepartitionByKey Behavior

2018-06-22 Thread Nathan Kronenfeld
On Thu, Jun 21, 2018 at 4:51 PM, Chawla,Sumit  wrote:

> Hi
>
> I have been trying to do this simple operation. I want to land all
> values with one key in the same partition, and not have any different key
> in the same partition. Is this possible? I am getting b and c always
> mixed together in the same partition.

I think you could do something approximately like:

 val keys = rdd.map(_.getKey).distinct.zipWithIndex
 val numKeys = keys.count
 rdd.map(r => (r.getKey, r)).join(keys)
   .map { case (_, (r, idx)) => (idx, r) }
   .partitionBy(new Partitioner {
     def numPartitions = numKeys.toInt
     def getPartition(key: Any) = key.asInstanceOf[Long].toInt
   })

i.e., key by a unique number, count that, and repartition by key to the
exact count.  This presumes, of course, that the number of keys is small
enough to serve as a partition count.
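Pulled together as a self-contained sketch (the `getKey` accessor is passed in explicitly here, since the element type in the original thread is unknown; this also assumes a live `SparkContext`, so treat it as an illustration rather than a drop-in method):

```scala
import org.apache.spark.Partitioner
import org.apache.spark.rdd.RDD

// One partition per distinct key: all records sharing a key land in the
// same partition, and no partition holds records with two different keys.
def partitionByDistinctKey[T](rdd: RDD[T], getKey: T => String): RDD[(Long, T)] = {
  // Assign each distinct key a stable index in 0..n-1.
  val keys: RDD[(String, Long)] = rdd.map(getKey).distinct.zipWithIndex
  val numKeys = keys.count.toInt

  rdd.map(r => (getKey(r), r))
    .join(keys)                              // (key, (record, index))
    .map { case (_, (r, idx)) => (idx, r) }  // re-key by the numeric index
    .partitionBy(new Partitioner {
      def numPartitions: Int = numKeys
      def getPartition(key: Any): Int = key.asInstanceOf[Long].toInt
    })
}
```

Note the cost: `count` and `zipWithIndex` each trigger a job over the keys, and the join shuffles the full dataset, so this only makes sense when the key count is modest and a strict one-key-per-partition layout is genuinely required.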

Unsubscribe

2018-06-22 Thread Tarun Kumar
Unsubscribe


Re: Jenkins build errors

2018-06-22 Thread Sean Owen
Also confused about this one, as many builds succeed. One possible
difference is that this failure is in the Hive tests, so are you building
and testing with -Phive locally, where it works? Still, that does not explain
the download failure. It could be a mirror problem, throttling, etc. But then
again, I haven't spotted another failing Hive test.

On Wed, Jun 20, 2018 at 1:55 AM Petar Zecevic 
wrote:

>
> It's still dying. Back to this error (it used to be spark-2.2.0 before):
>
> java.io.IOException: Cannot run program "./bin/spark-submit" (in directory 
> "/tmp/test-spark/spark-2.1.2"): error=2, No such file or directory
>
> So, a mirror is missing that Spark version... I don't understand why
> nobody else has these errors and I get them every time without fail.
>
>
> Petar
>
>