RE: What are the alternatives to nested DataFrames?

2018-12-29 Thread email
Two options I can think of: 1. Can you perform a union of the DFs returned by the Elasticsearch queries? The result would still be distributed, but I don't know whether there is a limit on how many union operations you can perform at a time. 2. Can you used
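The union idea in option 1 can be sketched as follows. This is only a sketch under stated assumptions: `searchCity` is a hypothetical stand-in for the per-city Elasticsearch query (the thread uses `sqlContext.esDF`), the `query_city` column name is invented for illustration, and the list of cities is assumed to be small enough to collect to the driver, since a SparkSession can only be used there. A long chain of unions also produces a large query plan, so this probably suits tens or hundreds of cities better than millions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

object UnionPerCity {
  // Hypothetical stand-in for the per-city Elasticsearch query in the thread.
  // It runs on the driver, where the SparkSession is available.
  def searchCity(spark: SparkSession, city: String): DataFrame = {
    import spark.implicits._
    Seq(s"$city (match)").toDF("name")
  }

  // Runs one query per city and unions the results into a single DataFrame,
  // tagging each row with the city that produced it.
  def searchAll(spark: SparkSession, cities: Seq[String]): DataFrame =
    cities
      .map(city => searchCity(spark, city).withColumn("query_city", lit(city)))
      .reduce(_ union _) // throws on an empty list; guard if that can happen

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("union-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumption: the city list is small enough to collect to the driver.
    val cities = Seq("Paris", "Lima").toDF("city").as[String].collect().toSeq

    searchAll(spark, cities).show()
    spark.stop()
  }
}
```

The result is one flat DataFrame rather than a DataFrame per city, so it can be joined back to the list of cities on `query_city` if needed.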

Re: What are the alternatives to nested DataFrames?

2018-12-28 Thread Shahab Yunus
m...@yeikel.com
Cc: Shahab Yunus; user
Subject: Re: What are the alternatives to nested DataFrames?
Could you join() the DFs on a common key?
On Fri, Dec 28, 2018 at 18:35 wrote:
Shahab, I am not sure what you are trying to say. Could

RE: What are the alternatives to nested DataFrames?

2018-12-28 Thread email
iginal DF and returns a new DataFrame including all the matching terms
From: Andrew Melo
Sent: Friday, December 28, 2018 8:48 PM
To: em...@yeikel.com
Cc: Shahab Yunus; user
Subject: Re: What are the alternatives to nested DataFrames?
Could you join() the DFs on a common key?

Re: What are the alternatives to nested DataFrames?

2018-12-28 Thread Andrew Melo
tString(0)
val qb = QueryBuilders.matchQuery("name", city).operator(Operator.AND)
print(qb.toString)
val dfs = sqlContext.esDF("cities/docs", qb.toString) // null pointer
dfs.show()

RE: What are the alternatives to nested DataFrames?

2018-12-28 Thread email
uery("name", city).operator(Operator.AND)
print(qb.toString)
val dfs = sqlContext.esDF("cities/docs", qb.toString) // null pointer
dfs.show()
})
From: Shahab Yunus
Sent: Friday, December 28, 2018 12:34 PM
To: em...@yeikel.com
Cc: user
Sub

Re: What are the alternatives to nested DataFrames?

2018-12-28 Thread Shahab Yunus
Can you have a DataFrame with a column that stores JSON (as a string)? Alternatively, you could have a column of array type in which you store all the cities matching your query.
On Fri, Dec 28, 2018 at 2:48 AM wrote:
Hi community, As shown in other answers online, Spark does not support the n
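A minimal sketch of the array-column and JSON-column suggestions, assuming the search results are already available as a flat DataFrame with one row per (city, match) pair. The object name and the column names `city`, `match`, `matches`, and `matches_json` are hypothetical and chosen only for illustration; they do not come from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, struct, to_json}

object NestedMatches {
  // Collapses the flat (city, match) rows into one row per city.
  // "matches" is an array<string> column; "matches_json" holds the same
  // data serialized as a JSON string.
  def nestMatches(cities: DataFrame, matches: DataFrame): DataFrame =
    cities
      .join(matches, Seq("city"), "left") // keep cities that have no match
      .groupBy("city")
      .agg(collect_list(col("match")).as("matches")) // nulls are skipped
      .withColumn("matches_json", to_json(struct(col("matches"))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nested-sketch")
      .getOrCreate()
    import spark.implicits._

    val cities = Seq("Paris", "Lima").toDF("city")
    // Made-up sample data standing in for the search results.
    val matches = Seq(
      ("Paris", "Paris, France"),
      ("Paris", "Paris, Texas")
    ).toDF("city", "match")

    nestMatches(cities, matches).show(truncate = false)
    spark.stop()
  }
}
```

Because `collect_list` ignores nulls, a city without any match ends up with an empty array rather than being dropped. The order of elements inside each array is not guaranteed.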

What are the alternatives to nested DataFrames?

2018-12-27 Thread email
Hi community,
As shown in other answers online, Spark does not support the nesting of DataFrames, but what are the options? I have the following scenario:
dataFrame1 = List of cities
dataFrame2 = Created after searching in Elasticsearch for each city in dataFrame1
I've tri

Re: Nested DataFrames

2015-06-25 Thread pawan kumar
Maybe you could try something like this using Spark SQL 1.4 and DataFrames:
import org.apache.spark.sql.functions.{avg, sum}
student.join(grade, grade("student_id") === student("student_id"), "left")
  .groupBy("id")
  .agg(sum(grade("Marks")), avg(grade("Marks")))
You could refer to the following document: https://spark.apache.o

RE: Nested DataFrames

2015-06-25 Thread Richard Catlin
I am looking to do something similar to this Postgres query in HiveQL. If I have a DataFrame student and a DataFrame grade, is this possible? I read in Learning Spark: Lightning-Fast Big Data Analysis that it should be possible. It says in Chapter 9 "SchemaRDDs can store several basic types, as