How did the RDD.union work

2014-11-11 Thread qiaou
Hi: I have a problem using the union method of RDD. I have a function like def hbaseQuery(area: String): RDD[Result] = ??? When I call hbaseQuery("aa").union(hbaseQuery("bb")).count() it returns 0. However, when I use it like this

Re: How did the RDD.union work

2014-11-11 Thread Shixiong Zhu
Could you provide the code of hbaseQuery? It may not support being executed in parallel. Best Regards, Shixiong Zhu 2014-11-12 14:32 GMT+08:00 qiaou qiaou8...@gmail.com: Hi: I got a problem with using the union method of RDD things like this I get a function like def

Re: How did the RDD.union work

2014-11-11 Thread qiaou
OK, here is the code:

    def hbaseQuery: (String) => RDD[Result] = {
      val generateRdd = (area: String) => {
        val startRowKey = s"$area${RowKeyUtils.convertToHex(startId, 10)}"
        val stopRowKey = s"$area${RowKeyUtils.convertToHex(endId, 10)}"

Re: How did the RDD.union work

2014-11-11 Thread qiaou
This works! But can you explain why it should be used like this? -- qiaou Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, November 12, 2014 at 3:18 PM, Shixiong Zhu wrote: You need to create a new configuration for each RDD. Therefore, val hbaseConf = HBaseConfigUtil.getHBaseConfiguration should be

Re: How did the RDD.union work

2014-11-11 Thread Shixiong Zhu
The `conf` object will be sent to other nodes via Broadcast. Here is the scaladoc of Broadcast: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.broadcast.Broadcast In addition, the object v should not be modified after it is broadcast in order to ensure that all nodes
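
The advice in this thread (one configuration per RDD, and no modification of an object once it has been handed off) can be illustrated without Spark or HBase. What follows is a minimal plain-Scala sketch, not the code from the thread: `Conf`, `rows`, `query`, `union`, and the `scan.prefix` key are all hypothetical stand-ins. A deferred function plays the role of a lazily evaluated RDD, and a mutable `Conf` plays the role of the HBase configuration. It only shows the general aliasing hazard; the exact symptom seen with HBase (a count of 0) may arise differently.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a mutable Hadoop/HBase configuration object.
final class Conf {
  private val settings = mutable.Map.empty[String, String]
  def set(key: String, value: String): Unit = settings.update(key, value)
  def get(key: String): String = settings(key)
}

object SharedConfSketch {
  // Hypothetical table contents: row keys prefixed by area.
  val rows: Seq[String] = Seq("aa01", "aa02", "bb01")

  // Stand-in for a lazy RDD: nothing is read until the function is invoked.
  type LazyQuery = () => Seq[String]

  // Writes the scan prefix into `conf` now, but reads it back only when the query runs.
  def query(conf: Conf, area: String): LazyQuery = {
    conf.set("scan.prefix", area)
    () => rows.filter(_.startsWith(conf.get("scan.prefix")))
  }

  // Stand-in for union: concatenates both results, still lazily.
  def union(a: LazyQuery, b: LazyQuery): LazyQuery = () => a() ++ b()

  // Both queries share one Conf, so the second set() overwrites the first.
  def sharedResult: Seq[String] = {
    val conf = new Conf
    union(query(conf, "aa"), query(conf, "bb"))()
  }

  // Each query gets its own Conf, so each keeps its own prefix.
  def freshResult: Seq[String] =
    union(query(new Conf, "aa"), query(new Conf, "bb"))()

  def main(args: Array[String]): Unit = {
    println(s"shared conf: $sharedResult")
    println(s"fresh confs: $freshResult")
  }
}
```

In this sketch the shared `Conf` makes both deferred queries see the last prefix written, so the rows for "aa" are lost, while the version with a fresh `Conf` per query returns the rows for both areas.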