Re: about broadcast join of base table in spark sql

2017-07-01 Thread Xiaoye Sun
You may need to check whether Spark can determine the size of your table. If Spark cannot get the table size, it won't broadcast the table.

On Sat, Jul 1, 2017 at 11:37 PM Paley Louie wrote:
> Thank you for your reply, I have tried to add broadcast hint to the base
> table, but it just
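
The size check described in this reply can be sketched as a small pure-Python model. It is a simplification, not Spark's code: the function name and constants are hypothetical, and the values assume the documented defaults (a 10 MB spark.sql.autoBroadcastJoinThreshold, where -1 disables automatic broadcast, and a fallback size estimate of Long.MaxValue when no statistics are available). It covers only the automatic decision, not explicit broadcast hints.

```python
from typing import Optional

# Assumed default of spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024

# Assumed fallback estimate when the table size is unknown
# (spark.sql.defaultSizeInBytes defaulting to Long.MaxValue).
UNKNOWN_SIZE_BYTES = 2**63 - 1


def can_auto_broadcast(
    estimated_size_bytes: Optional[int],
    threshold_bytes: int = DEFAULT_THRESHOLD_BYTES,
) -> bool:
    """Simplified model of the automatic broadcast decision.

    Pass None as the size when no estimate is available for the table.
    """
    if threshold_bytes < 0:
        # A negative threshold (-1) turns automatic broadcast off.
        return False
    if estimated_size_bytes is None:
        size = UNKNOWN_SIZE_BYTES
    else:
        size = estimated_size_bytes
    return 0 <= size <= threshold_bytes
```

Under this model, a table with no size estimate is never picked automatically, however high the threshold is set, because the fallback estimate exceeds any practical threshold.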

Re: about broadcast join of base table in spark sql

2017-07-01 Thread Paley Louie
Thank you for your reply. I have tried adding a broadcast hint to the base table, but the table still is not broadcast.

> On Jun 30, 2017, at 9:13 PM, Yong Zhang wrote:
>
> Or since you already use the DataFrame API, instead of SQL, you can add the
> broadcast function to

Re: about broadcast join of base table in spark sql

2017-07-01 Thread Paley Louie
Thank you for your reply. I have tried setting spark.sql.autoBroadcastJoinThreshold to a sufficiently high value, but it does not work. I think broadcasting a base table is disabled in Spark.

> On Jun 30, 2017, at 6:57 PM, Bryan Jeffrey wrote:
>
> Hello.
>
> If

Re: json in Cassandra to RDDs

2017-07-01 Thread ayan guha
Hi,

If you are asking how to parse the JSON column from Cassandra, I would suggest looking into the from_json function. It parses a JSON field, provided you know the schema up front.

On Sat, Jul 1, 2017 at 8:54 PM, Conconscious wrote:
> Hi list,
>
> I'm

json in Cassandra to RDDs

2017-07-01 Thread Conconscious
Hi list,

I'm using a Cassandra table with only two fields (id, json), and I'm using Spark to query the JSON. So far I can load a JSON file and query that file, but I have not yet managed to do the same with Cassandra and RDDs of the json field.

sc = spark.sparkContext
path = "/home/me/red50k.json"
redirectsDF = spark.read.json(path)
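
One way to approach the RDD side of this question is to parse each row's json field with a plain Python function, which could then be passed to an RDD's map. The sketch below uses only the standard library so it runs without a cluster. The function name, the sample rows, and the "url" key are hypothetical, and it assumes rows arrive as (id, json) pairs.

```python
import json
from typing import Any, Optional, Tuple


def parse_json_row(row: Tuple[Any, Optional[str]]) -> Optional[Tuple[Any, Any]]:
    """Turn an (id, json_text) pair into (id, parsed_value).

    Returns None when the json field is missing or malformed, so bad rows
    can be filtered out afterwards.
    """
    row_id, raw = row
    try:
        return row_id, json.loads(raw)
    except (TypeError, ValueError):
        # TypeError: the field was None; ValueError: the text was not valid JSON.
        return None


# Hypothetical sample rows standing in for what a Cassandra read would return.
rows = [(1, '{"url": "a"}'), (2, "not json"), (3, None)]

# Plain-Python equivalent of rdd.map(parse_json_row).filter(lambda p: p is not None)
parsed = [p for p in map(parse_json_row, rows) if p is not None]
```

Because parse_json_row takes one row and returns one value, the same function should work unchanged as the argument to map on an RDD of (id, json) pairs.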