Unsubscribe

2018-02-11 Thread Sandeep Varma
Sandeep Varma Principal ZS Associates India Pvt. Ltd. World Trade Center, Tower 3, Kharadi, Pune 411014, Maharashtra, India T | +91 20 6739 5224 M | +91 97 6633 0103 www.zs.com ZS Impact where it matters. Notice: This message,

Re: Schema - DataTypes.NullType

2018-02-11 Thread Nicholas Hakobian
I spent a few minutes poking around in the source code and found this: "The data type representing None, used for the types that cannot be inferred." https://github.com/apache/spark/blob/branch-2.1/python/pyspark/sql/types.py#L107-L113 Playing around a bit, this is the only use case that I could
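A minimal sketch of where `NullType` surfaces in practice: a bare NULL literal with no other type information. The live Spark calls are commented out so the snippet stands alone; the query strings are illustrative.

```python
# Hypothetical sketch: NullType is what Spark assigns when a column's type
# cannot be inferred, e.g. a bare NULL literal with no cast.
query = "SELECT NULL AS col"

# With a live SparkSession this would show the NullType field:
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
# spark.sql(query).schema   # StructType with a NullType column

# Casting the literal gives the column a concrete type instead:
typed_query = "SELECT CAST(NULL AS STRING) AS col"
```

The cast variant is the usual workaround when a schema with `NullType` is not wanted downstream.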

Re: optimize hive query to move a subset of data from one partition table to another table

2018-02-11 Thread amit kumar singh
Hi, create table emp as select * from emp_full where join_date >= date_sub(join_date, 2) — I am trying to select from one table and insert into another table. I need a way to select the last 2 months of data every time; the table is partitioned on year, month, day. On Sun, Feb 11, 2018 at 4:30 PM, Richard Qiao
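One thing to note: the filter in the question compares `join_date` to itself, which is always true. A hedged sketch of one way to keep only the last two months from a table partitioned on year/month/day, filtering on the reconstructed partition date. Table names (`emp_full`, `emp_recent`) and the 60-day window are illustrative; the statements would be submitted via the Hive CLI or `spark.sql` one at a time.

```python
# Sketch only: rebuild a 'yyyy-MM-dd' string from the partition columns and
# compare it lexicographically against the cutoff (zero-padding keeps the
# string comparison correct). Dynamic partitioning writes each day's
# partition into the target table.
set_mode = "SET hive.exec.dynamic.partition.mode=nonstrict"

move_last_two_months = """
INSERT OVERWRITE TABLE emp_recent PARTITION (year, month, day)
SELECT *
FROM emp_full
WHERE CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0'))
      >= CAST(DATE_SUB(CURRENT_DATE, 60) AS STRING)
"""
```

This assumes `year`, `month`, `day` are the trailing (partition) columns of `emp_full`, so `SELECT *` lines up with the dynamic partition spec.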

Re: Unsubscribe

2018-02-11 Thread purna pradeep
Unsubscribe

Re: optimize hive query to move a subset of data from one partition table to another table

2018-02-11 Thread Richard Qiao
Would you mind sharing your code with us to analyze? > On Feb 10, 2018, at 10:18 AM, amit kumar singh wrote: > > Hi Team, > > We have a Hive external table which has 50 TB of data partitioned on year > month day > > I want to move the last 2 months of data into another table >

Re: saveAsTable does not respect spark.sql.warehouse.dir

2018-02-11 Thread Lian Jiang
Thanks guys. Prashanth's idea worked for me. Appreciate it very much! On Sun, Feb 11, 2018 at 10:20 AM, prashanth t wrote: > Hi Lian, > > Please add the below command before creating the table. > "Use (database_name)" > By default saveAsTable uses the default database of Hive. You

Unsubscribe

2018-02-11 Thread Archit Thakur
Unsubscribe

Re: saveAsTable does not respect spark.sql.warehouse.dir

2018-02-11 Thread prashanth t
Hi Lian, Please add the below command before creating the table: "USE (database_name)". By default saveAsTable uses the default database of Hive. You might not have access to it, and that's what is causing the problem. Thanks Prashanth Thipparthi On 11 Feb 2018 10:45 pm, "Lian Jiang" wrote: I
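A sketch of the suggestion above. Database and table names are placeholders; the point is to qualify the table (or switch databases first) so `saveAsTable` does not silently target Hive's `default` database.

```python
# Placeholder names; either qualify the table name directly or issue
# USE <db> before writing.
target_db = "analytics"
target_table = "events"
qualified_name = f"{target_db}.{target_table}"  # "analytics.events"

# With a live SparkSession:
# spark.sql(f"CREATE DATABASE IF NOT EXISTS {target_db}")
# spark.sql(f"USE {target_db}")
# df.write.mode("overwrite").saveAsTable(qualified_name)
```

Qualifying the name is the more robust option, since it does not depend on session state set by an earlier `USE`.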

Re: Spark cannot find tables in Oracle database

2018-02-11 Thread Lian Jiang
Thanks Guys for help! Georg's proposal fixed the issue. Thanks a lot. On Sun, Feb 11, 2018 at 7:59 AM, Georg Heiler wrote: > I had the same problem. You need to uppercase all tables prior to storing > them in oracle. > Gourav Sengupta

Re: Spark cannot find tables in Oracle database

2018-02-11 Thread Georg Heiler
I had the same problem. You need to uppercase all table names prior to storing them in Oracle. Gourav Sengupta wrote on Sun., 11 Feb 2018 at 10:44: > Hi, > > since you are using the same user as the schema, I do not think that there > is an access issue. Perhaps you might
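A sketch of why this works: Oracle folds unquoted identifiers to uppercase in its data dictionary, so the JDBC `dbtable` option should match the stored name. The connection details below are placeholders, not a real endpoint.

```python
# Placeholder connection details; the uppercasing of dbtable is the point.
table = "employees"
jdbc_options = {
    "url": "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",  # placeholder URL
    "dbtable": table.upper(),  # "EMPLOYEES" matches Oracle's stored name
    "user": "scott",           # placeholder credentials
    "password": "tiger",
}

# With a live SparkSession:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
```

If the table really was created with quoted lowercase identifiers in Oracle, the quoted name would have to be preserved instead.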

Re: Apache Spark - Structured Streaming Query Status - field descriptions

2018-02-11 Thread M Singh
Thanks Richard. I am hoping that the Spark team will at some point provide more detailed documentation. On Sunday, February 11, 2018 2:17 AM, Richard Qiao wrote: Can't find a good source of documents, but the source code

Re: Schema - DataTypes.NullType

2018-02-11 Thread Jean Georges Perrin
What is the purpose of DataTypes.NullType, especially as you are building a schema? Has anyone used it or seen it as part of a schema auto-generation? (If I keep asking long enough, I may get an answer, no? :) ) > On Feb 4, 2018, at 13:15, Jean Georges Perrin wrote: > > Any

Re: Apache Spark - Structured Streaming Query Status - field descriptions

2018-02-11 Thread Richard Qiao
Can't find a good source of documents, but the source code “org.apache.spark.sql.execution.streaming.ProgressReporter” is helpful for answering some of them. For example: inputRowsPerSecond = numRecords / inputTimeSec, processedRowsPerSecond = numRecords / processingTimeSec This explains
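The two rate fields reduce to simple ratios over a trigger's metrics; a toy recomputation with illustrative numbers:

```python
# Illustrative numbers only; the formulas mirror the ones quoted from
# ProgressReporter above.
num_records = 1000          # rows read in this trigger
input_time_sec = 2.0        # seconds spanned by the input interval
processing_time_sec = 0.5   # seconds spent processing the batch

input_rows_per_second = num_records / input_time_sec            # 500.0
processed_rows_per_second = num_records / processing_time_sec   # 2000.0
```

So `processedRowsPerSecond` well above `inputRowsPerSecond`, as here, means the query is keeping up with its source.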

Re: Spark cannot find tables in Oracle database

2018-02-11 Thread Gourav Sengupta
Hi, since you are using the same user as the schema, I do not think that there is an access issue. Perhaps you might want to see whether there is anything case-sensitive about the table names. I remember once that the table names had to be in lowercase, but that was in MySQL. Regards,

Re: Spark cannot find tables in Oracle database

2018-02-11 Thread Jörn Franke
Maybe you do not have access to the table/view. In case of a view, it could also be that you do not have access to the underlying table. Have you tried accessing it with another SQL tool? > On 11. Feb 2018, at 03:26, Lian Jiang wrote: > > Hi, > > I am following >

Re: Spark Dataframe and HIVE

2018-02-11 Thread रविशंकर नायर
Hi, So, is this a bug, or something I need to fix? If it's our issue, how can we fix it? Please help. Best, On Sun, Feb 11, 2018 at 3:49 AM, Shmuel Blitz wrote: > Your table is missing a "PARTITIONED BY" section. > > Spark 2.x saves the partition information in the

Re: Spark Dataframe and HIVE

2018-02-11 Thread Shmuel Blitz
Your table is missing a "PARTITIONED BY" section. Spark 2.x saves the partition information in the TBLPROPERTIES section. On Sun, Feb 11, 2018 at 10:41 AM, Deepak Sharma wrote: > I can see it's trying to read the parquet and failing while decompressing > using snappy:

Re: Spark Dataframe and HIVE

2018-02-11 Thread Deepak Sharma
I can see it's trying to read the parquet and failing while decompressing using snappy: parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201) So the table looks good, but this needs to be fixed before you can query the data in Hive. Thanks Deepak On Sun, Feb 11, 2018 at

Re: Spark Dataframe and HIVE

2018-02-11 Thread रविशंकर नायर
When I do that and then do a select, it is full of errors. I think Hive fails to read the table. select * from mine; OK SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder

Re: Spark Dataframe and HIVE

2018-02-11 Thread रविशंकर नायर
Why does it matter? So, are you saying Spark SQL cannot create tables in Hive? If I need to use HiveContext, how should I change my code? Best, On Sun, Feb 11, 2018 at 3:09 AM, Deepak Sharma wrote: > I think this is the problem here. > You created the table using the

Re: Spark Dataframe and HIVE

2018-02-11 Thread Deepak Sharma
There was a typo: Instead of: alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine"; Use: alter table mine set location "hdfs://localhost:8020/user/hive/warehouse/mine"; On Sun, Feb 11, 2018 at 1:38 PM, Deepak Sharma wrote: > Try this in

Re: Spark Dataframe and HIVE

2018-02-11 Thread रविशंकर नायर
Sorry Mich. I did not create it using an explicit create statement. Instead I used the below: //Created a data frame loading from MySQL passion_df.write.saveAsTable("default.mine") After logging into Hive, Hive shows the table, but I cannot select the data. On Sun, Feb 11, 2018 at 3:08 AM, ☼ R Nair
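A hedged sketch of one plausible fix (not a confirmed one) for the pattern in this thread: enable Hive support so the table is registered in the Hive metastore, and write in Hive's own format so the Hive CLI can read the data back. Names are taken from the thread; the Spark calls are commented out so the sketch stands alone.

```python
# Without Hive support, saveAsTable records Spark-specific metadata
# (e.g. partitioning under TBLPROPERTIES) and a layout plain Hive may
# not be able to query.
table_name = "default.mine"

# from pyspark.sql import SparkSession
# spark = (SparkSession.builder
#          .appName("mysql-to-hive")
#          .enableHiveSupport()   # talk to the Hive metastore
#          .getOrCreate())
# (passion_df.write
#  .format("hive")                # Hive SerDe output, readable from Hive itself
#  .mode("overwrite")
#  .saveAsTable(table_name))
```

`format("hive")` with `saveAsTable` assumes Spark 2.2 or later; on older versions, creating the table from Hive first and inserting into it is the usual workaround.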

Re: Spark Dataframe and HIVE

2018-02-11 Thread Deepak Sharma
I think this is the problem here. You created the table using Spark SQL and not the Hive SQL context. Thanks Deepak On Sun, Feb 11, 2018 at 1:36 PM, Mich Talebzadeh wrote: > simple question have you created the table through spark sql or hive? > > I recall

Re: Spark Dataframe and HIVE

2018-02-11 Thread Deepak Sharma
Try this in hive: alter table mine set locations "hdfs://localhost:8020/user/hive/warehouse/mine"; Thanks Deepak On Sun, Feb 11, 2018 at 1:24 PM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote: > Hi, > Here you go: > > hive> show create table mine; > OK > CREATE TABLE `mine`( >

Re: Spark Dataframe and HIVE

2018-02-11 Thread रविशंकर नायर
I have created it using Spark SQL. Then I want to retrieve it from Hive; that's where the issue is. I can still retrieve it from Spark, no problem. Why is Hive not giving me the data? On Sun, Feb 11, 2018 at 3:06 AM, Mich Talebzadeh wrote: > simple question have you

Re: Spark Dataframe and HIVE

2018-02-11 Thread Mich Talebzadeh
Simple question: have you created the table through Spark SQL or Hive? I recall similar issues a while back. val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc) //val sqlContext = new HiveContext(sc) println("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(),