Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread pengyh
Good answer. Nice to know too. Sean Owen wrote: Spark is built with and ships with a copy of Scala. It doesn't use your local version. - To unsubscribe e-mail: user-unsubscr...@spark.apache.org
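One way to see which Scala version a given Spark distribution ships with is to look for the scala-library jar in its jars directory. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the usual binary-distribution layout in which the bundled jar is named scala-library-<version>.jar.

```python
import re
from pathlib import Path

# Assumed naming pattern of the bundled Scala runtime jar.
_SCALA_JAR = re.compile(r"^scala-library-(\d+\.\d+\.\d+)\.jar$")


def bundled_scala_version(jar_names):
    """Return the Scala version found in a list of jar file names, or None."""
    for name in jar_names:
        match = _SCALA_JAR.match(name)
        if match:
            return match.group(1)
    return None


def scala_version_in(spark_home):
    """Scan <spark_home>/jars for the bundled scala-library jar."""
    jars_dir = Path(spark_home) / "jars"
    return bundled_scala_version(p.name for p in jars_dir.iterdir())
```

Calling scala_version_in with the path of an unpacked Spark distribution should report the bundled version regardless of which Scala is installed locally.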

Re: Data ingestion

2022-08-17 Thread pengyh
From my experience, Spark can read from and write to both MySQL and Hive without trouble. Regards. Akash Vellukai wrote: How could we do data ingestion from MySQL to Hive with the help of Spark streaming and not with Kafka?
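A minimal sketch of the MySQL-to-Hive path mentioned in the reply, using Spark's JDBC data source and saveAsTable. It is a batch copy, not a streaming job: as far as I know Spark has no built-in JDBC streaming source, so continuous ingestion would need repeated runs or another mechanism. The function names, host, database, and table names are hypothetical, and the driver class assumes MySQL Connector/J 8.x is on the classpath.

```python
def mysql_jdbc_options(host, database, table, user, password, port=3306):
    """Build the option map for Spark's JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Assumption: MySQL Connector/J 8.x jar is available to Spark.
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def copy_mysql_table_to_hive(spark, options, hive_table):
    """Read one MySQL table over JDBC and append its rows to a Hive table."""
    df = spark.read.format("jdbc").options(**options).load()
    df.write.mode("append").saveAsTable(hive_table)
    return df


def run_example():
    """Hypothetical entry point; needs pyspark and a Hive-enabled cluster."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    spark = (
        SparkSession.builder.appName("mysql-to-hive")
        .enableHiveSupport()
        .getOrCreate()
    )
    opts = mysql_jdbc_options("db.example.com", "shop", "orders", "reader", "secret")
    copy_mysql_table_to_hive(spark, opts, "warehouse.orders")
```

run_example is not invoked here; it would be called from a script submitted with spark-submit.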

Re: Supported Hadoop versions for Spark 3.3

2022-08-15 Thread pengyh
My Spark cluster can access either Hadoop 2 or Hadoop 3, so it doesn't care what the current Hadoop version is. Håkan Nordgren wrote: Hi All: Which Hadoop versions (and distributions: Cloudera, Hortonworks, etc.) are supported for Spark 3.3 for the “Pre-built with user-provided Apache Hadoop”

Re: Unsubscribe

2022-08-10 Thread pengyh
To unsubscribe: user-unsubscr...@spark.apache.org Shrikar archak wrote: unsubscribe

Re: [Spark SQL] Omit Create Table Statement in Spark Sql

2022-08-09 Thread pengyh
You have to call saveAsTable or create a view before you can run a SQL query against the data. As the title says, does Spark SQL have a feature like Flink Catalog that lets you omit the `Create Table` statement and write the SQL query directly?
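The two options named in the reply can be sketched as follows, assuming a DataFrame already exists. The helper names are hypothetical; createOrReplaceTempView, saveAsTable, and spark.sql are the standard PySpark calls.

```python
def query_via_temp_view(spark, df, view_name, sql):
    """Register df as a session-scoped view, then run SQL that refers to it."""
    df.createOrReplaceTempView(view_name)
    return spark.sql(sql)


def query_via_saved_table(spark, df, table_name, sql):
    """Persist df as a catalog table, then run SQL that refers to it."""
    # "overwrite" is an illustrative choice; it replaces an existing table.
    df.write.mode("overwrite").saveAsTable(table_name)
    return spark.sql(sql)
```

For example, query_via_temp_view(spark, df, "people", "SELECT name FROM people") returns a new DataFrame without any `Create Table` statement. The view lives only for the session, while the saved table stays in the catalog.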

Re: Spark Scala API still not updated for 2.13 or it's a mistake?

2022-08-02 Thread pengyh
I can use Scala 2.13 with spark-shell, but not with spark-submit. Regards. Spark 3.3.0 supports 2.13, though you need to build it for 2.13. The default binary distro uses 2.12.

log transferring into hadoop/spark

2022-08-02 Thread pengyh
Since Flume is no longer actively developed, what's the current open-source tool for transferring web server logs into HDFS/Spark? Thank you.

Re: Use case idea

2022-08-01 Thread pengyh
* Stream handling is still useful in Spark, though Flink is an alternative.
* RDDs are also useful for transformations, especially on unstructured data.
* There are many SQL products on the market, such as Drill and Impala, but as far as I know Spark is more powerful for distributed deployment.
* we never

Re: unsubscribe

2022-08-01 Thread pengyh
You should be able to unsubscribe yourself by using the signature below. To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: Use case idea

2022-07-31 Thread pengyh
I don't think so. We were using Spark integrated with Kafka for stream computing and real-time reports. That just works. SPARK is now just an overhyped and overcomplicated ETL tool, nothing more; there is another distributed AI called Ray, which should be the next billion dollar company

Re: Use case idea

2022-07-31 Thread pengyh
I am afraid most of the SQL functions Spark has are also available in other BI tools. Spark is used for high-performance computing, not for SQL function comparison. Thanks. In other terms: what analytics functionality does Spark offer that no ERP has?