Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-15 Thread swetha kasireddy
Hi Mich,

No, I have not tried that. My requirement is to do the insert from an hourly Spark batch job. How is it different from trying the insert with the Hive CLI or beeline?

Thanks,
Swetha

On Tue, Jun 14, 2016 at 10:44 AM, Mich Talebzadeh wrote: > Hi Swetha, > > Have you

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-14 Thread Mich Talebzadeh
Hi Swetha,

Have you actually tried doing this in Hive using Hive CLI or beeline?

Thanks

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-14 Thread Mich Talebzadeh
In all probability there is no user database created in Hive. Create a database yourself:

sql("CREATE DATABASE IF NOT EXISTS test")

It would also help to grasp some of the basic concepts of Hive databases.

HTH

Dr Mich Talebzadeh
LinkedIn:

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-14 Thread swetha kasireddy
Hi Bijay,

This approach might not work for me, as I have to do partial inserts/overwrites in a given table, and data_frame.write.partitionBy will overwrite the entire table.

Thanks,
Swetha

On Mon, Jun 13, 2016 at 9:25 PM, Bijay Pathak wrote: > Hi Swetha, > > One
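Swetha's constraint above, overwriting only some partitions while leaving the rest of the table intact, usually comes down to the fact that Hive lays a partitioned table out as one directory per partition, so a partial overwrite only needs to touch the matching directories. A minimal sketch of that layout, using the partition column names from the code snippets elsewhere in this thread; the base path, helper name, and example values are illustrative assumptions, not from the thread:

```python
def partition_path(base_dir, id_part, dt_part):
    """Build the Hive-style partition directory for one (id, dt) pair.

    Hive stores partitioned tables as key=value subdirectories, which is
    why overwriting a single partition only touches its own directory.
    The column names follow the thread's snippet; the rest is illustrative.
    """
    return "{0}/idPartitioner={1}/dtPartitioner={2}".format(
        base_dir, id_part, dt_part)


# Example: the directory a partial overwrite of one partition would target.
print(partition_path("/user/hive/warehouse/userrecords", "42", "2016-06-14"))
```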

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-14 Thread Sree Eedupuganti
Hi Spark users,

I am new to Spark. I am trying to connect to Hive using a JavaSparkContext but am unable to connect to the database; by executing the code below I can see only the "default" database. Can anyone help me out? What I need is a sample program for querying Hive results using a JavaSparkContext. Need to

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-13 Thread Bijay Pathak
Hi Swetha,

One option is to use a Hive version with the above issue fixed, which is Hive 2.0, or Cloudera CDH Hive 1.2, which has the issue resolved. One thing to remember is that it is not the Hive you have installed that matters but the Hive version Spark is using, which in Spark 1.6 is Hive 1.2 as of now. The workaround I

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-13 Thread swetha kasireddy
Hi Mich,

Following is a sample code snippet:

val userDF = userRecsDF.toDF("idPartitioner", "dtPartitioner", "userId", "userRecord").persist()
System.out.println(" userRecsDF.partitions.size" + userRecsDF.partitions.size)
userDF.registerTempTable("userRecordsTemp")
sqlContext.sql("SET
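The snippet breaks off at the SET statement. For context, a dynamic-partition insert of this shape normally needs Hive's dynamic-partitioning settings raised before the INSERT OVERWRITE, since 2000 partitions exceeds Hive's defaults. The property names below are standard Hive configuration; the exact limit values, target table name, and SELECT list are assumptions sketched from the column names in the snippet above:

```sql
-- Enable dynamic partitioning and raise the default partition limits
-- (Hive's defaults are too low for a 2000-partition insert).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.max.dynamic.partitions = 2000;
SET hive.exec.max.dynamic.partitions.pernode = 2000;

-- Partition columns must come last in the SELECT, in PARTITION-clause order.
INSERT OVERWRITE TABLE userRecords PARTITION (idPartitioner, dtPartitioner)
SELECT userId, userRecord, idPartitioner, dtPartitioner FROM userRecordsTemp;
```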

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-13 Thread swetha kasireddy
Hi Bijay,

If I am hitting this issue, https://issues.apache.org/jira/browse/HIVE-11940, what needs to be done? Is upgrading to a higher version of Hive the only solution?

Thanks!

On Mon, Jun 13, 2016 at 10:47 AM, swetha kasireddy < swethakasire...@gmail.com> wrote: > Hi, > > Following is a

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-13 Thread swetha kasireddy
Hi,

Following is a sample code snippet:

val userDF = userRecsDF.toDF("idPartitioner", "dtPartitioner", "userId", "userRecord").persist()
System.out.println(" userRecsDF.partitions.size" + userRecsDF.partitions.size)
userDF.registerTempTable("userRecordsTemp")
sqlContext.sql("SET

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-10 Thread Bijay Pathak
Hello,

Looks like you are hitting this: https://issues.apache.org/jira/browse/HIVE-11940.

Thanks,
Bijay

On Thu, Jun 9, 2016 at 9:25 PM, Mich Talebzadeh wrote: > cam you provide a code snippet of how you are populating the target table > from temp table. > > > HTH

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread Mich Talebzadeh
Can you provide a code snippet of how you are populating the target table from the temp table?

HTH

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread swetha kasireddy
No, I am reading the data from HDFS, transforming it, registering the data in a temp table using registerTempTable, and then doing an insert overwrite using Spark SQL's HiveContext.

On Thu, Jun 9, 2016 at 3:40 PM, Mich Talebzadeh wrote: > how are you doing the insert?

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread Mich Talebzadeh
How are you doing the insert? From an existing table?

Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread swetha kasireddy
400 cores are assigned to this job.

On Thu, Jun 9, 2016 at 1:16 PM, Stephen Boesch wrote: > How many workers (/cpu cores) are assigned to this job? > > 2016-06-09 13:01 GMT-07:00 SRK : > >> Hi, >> >> How to insert data into 2000

Re: How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread Stephen Boesch
How many workers (/cpu cores) are assigned to this job?

2016-06-09 13:01 GMT-07:00 SRK : > Hi, > > How to insert data into 2000 partitions(directories) of ORC/parquet at a > time using Spark SQL? It seems to be not performant when I try to insert > 2000 directories of

How to insert data into 2000 partitions(directories) of ORC/parquet at a time using Spark SQL?

2016-06-09 Thread SRK
Hi,

How to insert data into 2000 partitions (directories) of ORC/Parquet at a time using Spark SQL? It does not seem to be performant when I try to insert into 2000 directories of Parquet/ORC using Spark SQL. Did anyone face this issue?

Thanks!
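One mitigation that often helps with this kind of load, offered here as a hedged sketch rather than something prescribed in this thread: split the 2000 partitions into smaller batches and run one insert per batch, so that no single job has to commit thousands of directories (file moves plus metastore updates) at once. The helper below shows only the batching logic; the batch size and names are illustrative assumptions:

```python
def batch_partitions(partitions, batch_size):
    """Split a large list of partition keys into smaller insert batches.

    Running one INSERT per batch keeps each job's commit phase small,
    instead of a single job touching 2000 partition directories.
    """
    return [partitions[i:i + batch_size]
            for i in range(0, len(partitions), batch_size)]


# Example: 2000 partition ids in batches of 200 -> 10 separate insert jobs.
parts = ["idPartitioner={0}".format(i) for i in range(2000)]
batches = batch_partitions(parts, 200)
print(len(batches))  # 10
```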