Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread pengyh

good answer. nice to know too.

Sean Owen wrote:

Spark is built with and ships with a copy of Scala. It doesn't use your
local version.
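
One way to confirm which Scala version a Spark install was built with is to look at the jar names under its jars directory, since Spark artifacts carry the Scala binary version as a suffix (for example spark-core_2.12-3.2.1.jar). The sketch below only parses such a name; the helper name bundled_versions is hypothetical and not part of Spark.

```python
import re
from typing import Optional, Tuple

# Spark artifacts are named <module>_<scala binary version>-<spark version>.jar,
# e.g. spark-core_2.12-3.2.1.jar. This pattern only handles plain x.y.z releases.
_SPARK_CORE_JAR = re.compile(r"^spark-core_(\d+\.\d+)-(\d+\.\d+\.\d+)\.jar$")


def bundled_versions(jar_name: str) -> Optional[Tuple[str, str]]:
    """Return (scala_binary_version, spark_version), or None if the name does not match.

    Hypothetical helper for illustration only.
    """
    match = _SPARK_CORE_JAR.match(jar_name)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(bundled_versions("spark-core_2.12-3.2.1.jar"))
```

Running spark-submit --version should also print the Scala version the distribution was built with, which is the version that matters rather than any Scala installed locally.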


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Data ingestion

2022-08-17 Thread pengyh
From my experience, Spark can read from and write to both MySQL and
Hive without trouble.


regards.


Akash Vellukai wrote:
How could we ingest data from MySQL into Hive with the help of Spark
Streaming, and without Kafka?
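
A minimal sketch of the read/write path mentioned in the reply, assuming a plain batch job rather than a streaming one (as far as I know Spark has no built-in streaming source for JDBC, so a periodically scheduled batch read is one possible substitute). The function names, host, database, table names and credentials are all hypothetical; the SparkSession is assumed to have been created with Hive support enabled and with the MySQL Connector/J jar on the classpath.

```python
def mysql_jdbc_options(host, port, database, table, user, password):
    """Build the option map for Spark's JDBC data source (hypothetical helper)."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class of MySQL Connector/J 8.x; assumed to be on the classpath.
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def ingest_table(spark, options, hive_table):
    """Read one MySQL table over JDBC and append it to a Hive table.

    `spark` is assumed to be a SparkSession built with enableHiveSupport().
    """
    df = spark.read.format("jdbc").options(**options).load()
    df.write.mode("append").saveAsTable(hive_table)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    opts = mysql_jdbc_options("db.example.com", 3306, "shop", "orders", "reader", "secret")
    print(opts["url"])
```

With a real session the call would look like ingest_table(spark, opts, "warehouse.orders"). Appending the whole table on every run duplicates rows, so an actual job would need some incremental filter, which is left out here.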





Re: Supported Hadoop versions for Spark 3.3

2022-08-15 Thread pengyh
my spark cluster can access either hadoop 2 or 3. so it doesn't care 
what the current hadoop version is.


Håkan Nordgren wrote:
Hi All: Which Hadoop versions (and distributions — Cloudera, 
Hortonworks, etc.) are supported for Spark 3.3 for the “Pre-built with 
user-provided Apache Hadoop” package from 
https://spark.apache.org/downloads.html? 






Re: Unsubscribe

2022-08-10 Thread pengyh

to unsubscribe: user-unsubscr...@spark.apache.org


Shrikar archak wrote:



unsubscribe





Re: [Spark SQL] Omit Create Table Statement in Spark Sql

2022-08-09 Thread pengyh

To query a DataFrame with SQL, you have to either save it with
saveAsTable or register it as a view first.


As the title says, does Spark SQL have a feature like Flink Catalog that
lets you omit the `Create Table` statement and write SQL queries directly?
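
The two options named in the reply can be sketched as below. The function names, view name and query are hypothetical; `spark` and `df` are assumed to be an existing SparkSession and DataFrame. Neither path needs a hand-written Create Table statement, because the schema comes from the DataFrame.

```python
def query_via_temp_view(spark, df, view_name, query):
    """Register df as a session-scoped view, then run a SQL query against it."""
    df.createOrReplaceTempView(view_name)
    return spark.sql(query)


def persist_as_table(df, table_name):
    """Save df as a table in the session catalog so later SQL can refer to it by name."""
    df.write.mode("overwrite").saveAsTable(table_name)
```

For example, query_via_temp_view(spark, df, "logs", "SELECT count(*) FROM logs") returns a new DataFrame with the result. A temp view lasts only for the session, while saveAsTable writes data and registers the table in the catalog.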





Re: Spark Scala API still not updated for 2.13 or it's a mistake?

2022-08-02 Thread pengyh



I can use scala 2.13 for spark-shell, but not spark-submit.

regards.

Spark 3.3.0 supports 2.13, though you need to build it for 2.13. The 
default binary distro uses 2.12.





log transferring into hadoop/spark

2022-08-02 Thread pengyh

Since Flume is no longer being actively developed, what is the current
open-source tool for transferring web server logs into HDFS/Spark?

thank you.




Re: Use case idea

2022-08-01 Thread pengyh



* stream processing is still useful in Spark, though there is Flink as
an alternative
* RDD is also useful for transformations, especially on unstructured data
* there are many SQL products on the market like Drill/Impala, but Spark is
more powerful for distributed deployment as far as I know
* we never used Spark for AI training; we use Keras/PyTorch, which make it
pretty easy to develop a model.


Perhaps you should try other systems in the market first; that will give
you an unbiased view of Databricks and SPARK as just an
over-glamourised tool. The hope of extending SPARK with a separate easy to
use query engine for deep learning and other AI systems is gone now with
Ray. The SPARK community now largely just defends the lack of support and
direction in this matter, which is a joke.






Re: unsubscribe

2022-08-01 Thread pengyh

You should be able to unsubscribe yourself using the address in the
signature below.



To unsubscribe e-mail: user-unsubscr...@spark.apache.org





Re: Use case idea

2022-07-31 Thread pengyh



I don't think so. We were using Spark integrated with Kafka for
stream computing and realtime reports. That just works.



SPARK is now just an overhyped and overcomplicated ETL tool, nothing
more. There is another distributed AI framework called Ray, which should be
the next billion dollar company, instead of just building those features in
SPARK natively using a different computation engine :)





Re: Use case idea

2022-07-31 Thread pengyh



I am afraid most of the SQL functions Spark has are also available in
other BI tools.

Spark is used for high performance computing, not for SQL function
comparison.

Thanks.


In other words: what analytics functionality does Spark offer that no ERP has?

