How to partition a SparkDataFrame using all distinct column values in sparkR

2016-07-25 Thread Neil Chang
Hi, this is a question regarding SparkR in Spark 2.0. I have a SparkDataFrame and I want to partition it by one column's values. Each distinct value corresponds to one partition: all rows having the same column value shall go to the same partition, no more, no less. Seems the
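
The requirement described in this message (one partition per distinct column value, with every row of that value landing in it) can be illustrated with a minimal sketch in plain Python. This is not SparkR or any Spark API; the function name `partition_by_column` and the sample rows are hypothetical and only show the intended mapping from values to partitions.

```python
from collections import defaultdict


def partition_by_column(rows, column):
    """Group rows so that each distinct value of `column` gets its own partition.

    Hypothetical helper for illustration only; it does not call Spark.
    Partition indices are assigned in the order values are first seen.
    """
    index_of = {}                   # distinct value -> partition index
    partitions = defaultdict(list)  # partition index -> rows in that partition
    for row in rows:
        key = row[column]
        if key not in index_of:
            index_of[key] = len(index_of)
        partitions[index_of[key]].append(row)
    return dict(partitions)


# Made-up sample data: two distinct values in the "group" column.
rows = [
    {"group": "a", "x": 1},
    {"group": "b", "x": 2},
    {"group": "a", "x": 3},
]
parts = partition_by_column(rows, "group")
```

Here the number of partitions equals the number of distinct values, which is the "no more, no less" property the question asks for.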

Re: How to get the number of partitions for a SparkDataFrame in Spark 2.0-preview?

2016-07-23 Thread Neil Chang
ed to open a connection to a > database so it's better to re-use that connection for one partition's > elements than to create it for each element. > > What are you trying to accomplish with dapply? > > On Fri, Jul 22, 2016 at 8:05 PM, Neil Chang <iam...@gmail.com > <javasc

Re: How to get the number of partitions for a SparkDataFrame in Spark 2.0-preview?

2016-07-22 Thread Neil Chang
<ski.rodrig...@gmail.com> wrote: > This should work and I don't think triggers any actions: > > df.rdd.partitions.length > > On Fri, Jul 22, 2016 at 2:20 PM, Neil Chang <iam...@gmail.com> wrote: > >> Seems no function does this in Spark 2.0 preview? >>

How to get the number of partitions for a SparkDataFrame in Spark 2.0-preview?

2016-07-22 Thread Neil Chang
Seems no function does this in Spark 2.0 preview?

Re: spark worker continuously trying to connect to master and failed in standalone mode

2016-07-22 Thread Neil Chang
settings on the master: >> https://help.ubuntu.com/lts/serverguide/firewall.html >> >> On Jul 19, 2016, at 6:25 PM, Neil Chang <iam...@gmail.com> wrote: >> >> Hi, >> I have two virtual pcs on private cloud (ubuntu 14). I installed spark >> 2.0 preview on

spark worker continuously trying to connect to master and failed in standalone mode

2016-07-19 Thread Neil Chang
Hi, I have two virtual PCs on a private cloud (Ubuntu 14). I installed Spark 2.0 preview on both machines and then tried to test it in standalone mode. I have no problem starting the master. However, when I start the worker (slave) on the other machine, it makes many attempts to connect to the master and