Hi,
What's the deployment process then (if not using spark-submit)? How is the
AM deployed? Why would you want to skip spark-submit?
Jacek
On 19 Mar 2018 00:20, "Serega Sheypak" wrote:
> Hi, Is it even possible to run Spark on YARN as a usual Java application?
> I've built a jar using maven with the spark-yarn dependency and I manually
> populate SparkConf with all hadoop properties.
Hi,
My dataframe has 2000 rows. Processing each row takes
3 seconds, so sequentially it takes 2000 * 3 = 6000 seconds,
which is far too slow.
I am therefore planning to run the function in parallel.
For example, I would like to divide the tota
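A minimal sketch of one way to parallelize that (process_row and the
partition count are illustrative assumptions, not from the original
message): repartition the rows so executors work on them concurrently.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(2000)  # stand-in for the 2000-row dataframe

def process_row(row):
    # placeholder for the ~3-second-per-row computation
    return (row.id, row.id * 2)

# Spread the rows over many partitions; each partition is processed by
# an executor core, so rows run concurrently instead of one by one.
result = df.repartition(100).rdd.map(process_row).collect()

With N parallel cores the wall-clock time drops roughly to 6000 / N seconds.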
Mind if I ask for a reproducer? It seems to return timestamps fine:
>>> from pyspark.sql.functions import *
>>> spark.range(1).select(to_timestamp(current_timestamp())).printSchema()
root
|-- to_timestamp(current_timestamp()): timestamp (nullable = false)
>>> spark.range(1).select(to_timestamp(current_
Hi,
The issue is not with Spark in this case, it is with Oracle. If you do not
know which columns to apply the date-related conversion rules to, then you
have a problem.
You should try either
a) Define a config file where you specify the table name, date column
name and date format at the source so that you can
b) The other approach would be to write to a temp table and then merge the
data, but this may be an expensive solution.
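A minimal sketch of option (a), with a hypothetical config dict mapping
each table to its date column and source format (all names below are
illustrative, not from this thread):

from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

date_config = {"my_hive_table": ("txn_date", "yyyy-MM-dd")}  # assumed

table = "my_hive_table"
df = spark.table(table)
date_col, fmt = date_config[table]
# cast the varchar date column to a real date before the JDBC write
df = df.withColumn(date_col, to_date(col(date_col), fmt))

df.write.format("jdbc") \
    .option("url", "jdbc:oracle:thin:@//host:1521/service") \
    .option("dbtable", table) \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .mode("append").save()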
Thanks
Deepak
On Mon, Mar 19, 2018, 08:04 Gurusamy Thirupathy wrote:
> Hi,
>
> I am trying to read data from Hive as a DataFrame, then write the
> DF into the Oracle database. I
Hi,
I am trying to read data from Hive as a DataFrame, then write the
DF into the Oracle database. In this case, the date field/column in Hive
has type Varchar(20),
but the corresponding column type in Oracle is Date. While reading from
Hive, the Hive table names are dynamically decid
You can use the EXPLAIN statement to see the optimized plan for each query. (
https://stackoverflow.com/questions/35883620/spark-how-can-get-the-logical-physical-query-execution-using-thirft-hive
).
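For example (illustrative; assumes an active spark session and a table
named mytable):

spark.sql("EXPLAIN EXTENDED select * from mytable").show(truncate=False)
# or, on a DataFrame: prints the parsed, analyzed, optimized and
# physical plans
spark.table("mytable").explain(True)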
2018-03-19 0:52 GMT+07:00 CPC :
> Hi nguyen,
>
> Thank you for the quick response. But what I am trying to understand
Hi, Is it even possible to run Spark on YARN as a usual Java application?
I've built a jar using maven with the spark-yarn dependency and I manually
populate SparkConf with all hadoop properties.
SparkContext fails to start with exception:
1. Caused by: java.lang.IllegalStateException: Library director
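For what it's worth, a client-mode session can be built directly in code;
here is a minimal sketch in PySpark (the same configs apply from
Java/Scala, and this is an assumption, not a confirmed fix for the
exception above). HADOOP_CONF_DIR/YARN_CONF_DIR must point at the cluster
configs, and spark.yarn.jars (or spark.yarn.archive) must point at the
Spark jars so the ApplicationMaster can be provisioned:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("yarn")
         .config("spark.submit.deployMode", "client")
         .config("spark.yarn.jars", "hdfs:///spark/jars/*")  # assumed path
         .getOrCreate())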
Hi nguyen,
Thank you for the quick response. But what I am trying to understand is in
both queries predicate evaluation requires only one column. So actually
Spark does not need to read all columns in the projection if they are not
used in the filter predicate. Just to give an example, Amazon Redshift has
this kin
Hi,
Filed https://issues.apache.org/jira/browse/SPARK-23731 and am working on
a workaround (aka fix).
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Maste
Hi @CPC,
Parquet is a columnar storage format, so if you want to read data from only
one column, you can do that without accessing all of your data. Spark SQL
includes a query optimizer (see
https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html),
so it will optimi
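A small illustration (table and column names assumed from the question
below): the ReadSchema in the physical plan shows which columns Spark
actually fetches from Parquet.

df = spark.read.parquet("/data/mytable")  # assumed path
df.select("businesskey").where("businesskey = 'x'").explain()
# The FileScan line should show ReadSchema: struct<businesskey:string>,
# i.e. the large request/response columns are never read.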
Hi everybody,
I am trying to understand how Spark reads Parquet files, but I am a little
confused. I have a table with 4 columns named
businesskey, transactionname, request and response. The request and response
columns are huge (10-50 KB each). When I execute a query like
"select * from mytable wher
Thanks a lot!
2018-03-18 9:30 GMT+01:00 Denis Bolshakov :
> Please check out:
>
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand
>
>
> and
>
> org.apache.spark.sql.execution.datasources.WriteRelation
>
>
> I guess it's managed by
>
> job.getConfiguration.set(DATASOURCE_WRITEJOBUUID, uniqueWriteJobId.toString)
This is a good start:
https://github.com/deanwampler/JustEnoughScalaForSpark
And the corresponding talk:
https://www.youtube.com/watch?v=LBoSgiLV_NQ
There are many more resources if you search for them.
-kr, Gerard.
On Sun, Mar 18, 2018 at 11:15 AM, Mahender Sarangam <
mahender.bigd...@outlook.com
Hi,
Can anyone share nice tutorials on Spark with Scala, such as
videos or blogs for beginners? Mostly focusing on writing Scala code.
Thanks in advance.
Hi,
I'm new to Spark and Scala and need help transforming nested JSON using
Scala. Our upstream returns JSON like:
{
  "id": 100,
  "text": "Hello, world.",
  "Users": [ "User1": {
      "name": "Brett",
      "id": 200,
      "Type": "Employee",
      "empid": "2"
    },
    "Use
Please check out:
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand
and
org.apache.spark.sql.execution.datasources.WriteRelation
I guess it's managed by
job.getConfiguration.set(DATASOURCE_WRITEJOBUUID, uniqueWriteJobId.toString)
On 17 March 2018 at 20:46, Serega
I am submitting a script to spark-submit and passing it a file using the
--files option. Later on I need to read it in a worker.
I don't understand which API I should use to do that. I figured I'd just try:
with open('myfile'):
but this did not work.
I am able to pass the file using the addFile me
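For reference, the usual pattern: files shipped with --files (or
SparkContext.addFile) land in each executor's working directory, and
SparkFiles.get resolves the local path on the worker. A minimal sketch
(assumes sc is the active SparkContext; 'myfile' is the name from the
question):

from pyspark import SparkFiles

def read_on_worker(_):
    with open(SparkFiles.get('myfile')) as f:
        return [f.readline()]

sc.parallelize([0], 1).flatMap(read_on_worker).collect()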