Re:

2022-04-02 Thread Bitfox
Nice reading. Can you give a comparison on Hive on MR3 and Hive on Tez? Thanks On Sat, Apr 2, 2022 at 7:17 PM Sungwoo Park wrote: > Hi Spark users, > > We have published an article where we evaluate the performance of Spark > 2.3.8 and Spark 3.2.1 (along with Hive 3). If interested, please

Re: out of memory error

2022-03-29 Thread Bitfox
tiny. Hadoop ecosystem is usually > memory-intensive > > Message from Bitfox on Tue, 29 March 2022 at > 14:46: > >> Yes, a quite small table with 1 rows for test purposes. >> >> Thanks >> >> On Tue, Mar 29, 2022 at 8:43 PM Pau Tallada wrote: >

Re: out of memory error

2022-03-29 Thread Bitfox
Yes, a quite small table with 1 rows for test purposes. Thanks On Tue, Mar 29, 2022 at 8:43 PM Pau Tallada wrote: > Hi, > > I think it depends a lot on the data volume you are trying to process. > Does it work with a smaller table? > > Message from Bitfox on Tue, 29

Re: out of memory error

2022-03-29 Thread Bitfox
l gets the same error. please help. thanks. On Tue, Mar 29, 2022 at 8:32 PM Pau Tallada wrote: > I assume you have to increase container size (if using tez/yarn) > > Message from Bitfox on Tue, 29 March 2022 at > 14:30: > >> My hive run out of memory even for a small

out of memory error

2022-03-29 Thread Bitfox
My hive run out of memory even for a small query: 2022-03-29T20:26:51,440 WARN [Thread-1329] mapred.LocalJobRunner: job_local300585280_0011 java.lang.Exception: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)

Re: Hive 3 with tez issue

2022-03-28 Thread Bitfox
Or, is there a standard installation guide for integration tez and hive3? Thank you. On Mon, Mar 28, 2022 at 12:21 PM Bitfox wrote: > When I had this config in hive-env.sh: > > export > HADOOP_CLASSPATH=/opt/tez/conf:/opt/tez/*:/opt/tez/lib/*:$HADOOP_CLASSPATH > > > > a

Hive 3 with tez issue

2022-03-27 Thread Bitfox
"1.8.0_321" All of them were installed in a local node for development purposes. Please help with this issue. Thanks. Bitfox

Question for so many SQL tools

2022-03-25 Thread Bitfox
Just a question why there are so many SQL based tools existing for data jobs? The ones I know, Spark Flink Ignite Impala Drill Hive … They are doing the similar jobs IMO. Thanks

Re: GraphX Support

2022-03-25 Thread Bitfox
BTW , is MLlib still in active development? Thanks On Tue, Mar 22, 2022 at 07:11 Sean Owen wrote: > GraphX is not active, though still there and does continue to build and > test with each Spark release. GraphFrames kind of superseded it, but is > also not super active FWIW. > > On Mon, Mar

Re: Continuous ML model training in stream mode

2022-03-18 Thread Bitfox
For online recommendation systems, continuous training is needed. :) And we are a living video player, the content is changing every minute, so a real time rec system is the must. On Fri, Mar 18, 2022 at 3:31 AM Sean Owen wrote: > (Thank you, not sure that was me though) > I don't know of

Re: Continuous ML model training in stream mode

2022-03-18 Thread Bitfox
we are keeping the training with the input content from a streaming. But the framework is tensorflow not spark. On Wed, Mar 16, 2022 at 4:46 AM Artemis User wrote: > Has anyone done any experiments of training an ML model using stream > data? especially for unsupervised models? Any

Re: Does Apache Kafka support IPv6/IPv4 or IPv6-only networks?

2022-03-17 Thread Bitfox
From my experience, it supports both. On Thu, Mar 17, 2022 at 10:18 PM 5 wrote: > Hi, everyone, does Apache Kafka support IPv6/IPv4 or IPv6-only networks?

Play data development with Scala and Spark

2022-03-16 Thread Bitfox
Hello, I have written a free book which is available online, giving a beginner introduction to Scala and Spark development. https://github.com/bitfoxtop/Play-Data-Development-with-Scala-and-Spark/blob/main/PDDWS2-v1.pdf If you can read Chinese then you are welcome to give any feedback. I will

Re: Question on List to DF

2022-03-16 Thread Bitfox
g): DataFrame{ > ….. > } > } > > and a implicit converter > implicit def convertListToMyList(list: List): MyList { > > …. > } > > when you do > List("apple","orange","cherry").toDF("fruit") > > > > Internall

Question on List to DF

2022-03-15 Thread Bitfox
I am wondering why the list in scala spark can be converted into a dataframe directly? scala> val df = List("apple","orange","cherry").toDF("fruit") *df*: *org.apache.spark.sql.DataFrame* = [fruit: string] scala> df.show +--+ | fruit| +--+ | apple| |orange| |cherry| +--+ I

Re: Unsubscribe

2022-03-11 Thread Bitfox
please send an empty email to: user-unsubscr...@spark.apache.org to unsubscribe yourself from the list. On Sat, Mar 12, 2022 at 2:42 PM Aziret Satybaldiev < satybaldiev.azi...@gmail.com> wrote: >

insufficient memory

2022-03-10 Thread Bitfox
Hello My VM has only 4gb memory, 2gb free for use. When I run drill-embedded i got the error: OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0007, 4294967296, 0) failed; error='Not enough space' (errno=12) # # There is insufficient memory for the Java Runtime

Hive with tez engine gets the error

2022-03-10 Thread Bitfox
Hive with tez engine can't run. errors: 0: jdbc:hive2://localhost:1/default> select * from people; Error: java.io.IOException: java.io.IOException: com.google.protobuf.ServiceException: java.lang.NoSuchFieldError: PARSER (state=,code=0) Apache Hive (version 2.3.9) Hadoop 3.3.1 Tez: I

Re: Hive 3 and Java 11 issue

2022-03-10 Thread Bitfox
That sounds bad. All our apps are running on JDK 11. On Thu, Mar 10, 2022 at 5:06 PM Pau Tallada wrote: > I think only JDK8 is supported yet > > Message from Bitfox on Thu, 10 March 2022 at > 2:39: > >> my java version: >> >> openjdk version "11

Hive 3 and Java 11 issue

2022-03-09 Thread Bitfox
my java version: openjdk version "11.0.13" 2021-10-19 I can't run hive 3.1.2. The error include: Exception in thread "main" java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader

Re: protobuf.ServiceException

2022-03-09 Thread Bitfox
guess that's where your problem lies. > > On Thu, 2022-03-10 at 06:57 +0800, Bitfox wrote: > > Hello > > In beeline I am getting the error: > > 0: jdbc:hive2://localhost:1/default> select * from people; > > Error: java.io.IOException: java.io.IOException:

protobuf.ServiceException

2022-03-09 Thread Bitfox
Hello In beeline I am getting the error: 0: jdbc:hive2://localhost:1/default> select * from people; Error: java.io.IOException: java.io.IOException: com.google.protobuf.ServiceException: java.lang.NoSuchFieldError: PARSER (state=,code=0) Apache Hive (version 2.3.9) Hadoop 3.3.1 $

Re: question about a beeline variable

2022-02-27 Thread Bitfox
I got the idea it's the null value in Hive. 0: jdbc:hive2://localhost:1/default> select size(null); +--+ | _c0 | +--+ | -1 | +--+ Thanks On Sun, Feb 27, 2022 at 4:02 PM Bitfox wrote: > what does this -1 value mean? > > > set mapr

question about a beeline variable

2022-02-27 Thread Bitfox
what does this -1 value mean? > set mapred.reduce.tasks; +-+ | set | +-+ | mapred.reduce.tasks=-1 | +-+ 1 row selected (0.014 seconds)

Re: Issue while creating spark app

2022-02-26 Thread Bitfox
hanks > Rajat > > On Sun, Feb 27, 2022, 00:52 Bitfox wrote: > >> You need to install scala first, the current version for spark is 2.12.15 >> I would suggest you install scala by sdk which works great. >> >> Thanks >> >> On Sun, Feb 27, 2022 at

Re: Issue while creating spark app

2022-02-26 Thread Bitfox
You need to install scala first, the current version for spark is 2.12.15 I would suggest you install scala by sdk which works great. Thanks On Sun, Feb 27, 2022 at 12:10 AM rajat kumar wrote: > Hello Users, > > I am trying to create spark application using Scala(Intellij). > I have installed

Re: [E] COMMERCIAL BULK: Re: TensorFlow on Spark

2022-02-24 Thread Bitfox
extending the dataframes > from SPARK to deep learning and other frameworks by natively integrating > them. > > > Regards, > Gourav Sengupta > > > On Wed, Feb 23, 2022 at 4:42 PM Dennis Suhari > wrote: > >> Currently we are trying AnalyticsZoo and Ray >>

Re: help with beeline connection to hive

2022-02-23 Thread Bitfox
e or destruction of data or any other property which may >> arise from relying on this email's technical content is explicitly >> disclaimed. The author will in no case be liable for any monetary damages >> arising from such loss, damage or destruction. >> >> >> >

Re: One click to run Spark on Kubernetes

2022-02-23 Thread Bitfox
from my viewpoints, if there is such a pay as you go service I would like to use. otherwise I have to deploy a regular spark cluster with GCP/AWS etc and the cost is not low. Thanks. On Wed, Feb 23, 2022 at 4:00 PM bo yang wrote: > Right, normally people start with simple script, then add more

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
or will pick up the CRD and launch the > Spark application. The one click tool intends to hide these details, so > people could just submit Spark and do not need to deal with too many > deployment details. > > On Tue, Feb 22, 2022 at 8:09 PM Bitfox wrote: > >> Can it be a

Re: One click to run Spark on Kubernetes

2022-02-22 Thread Bitfox
Can it be a cluster installation of spark? or just the standalone node? Thanks On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote: > Hi Spark Community, > > We built an open source tool to deploy and run Spark on Kubernetes with a > one click command. For example, on AWS, it could automatically

Re: [E] COMMERCIAL BULK: Re: TensorFlow on Spark

2022-02-22 Thread Bitfox
tensorflow itself can implement the distributed computing via a parameter server. Why did you want spark here? regards. On Wed, Feb 23, 2022 at 11:27 AM Vijayant Kumar wrote: > Thanks Sean for your response. !! > > > > Want to add some more background here. > > > > I am using Spark3.0+ version

help with beeline connection to hive

2022-02-22 Thread Bitfox
Hello I have hive 2.3.9 installed by default on localhost for testing. HDFS is also installed on localhost, which works correctly b/c I have already used the file storage feature. I didn't change any configure files for hive. I can login into hive shell: hive> show databases; OK default

Re: Unsubscribe

2022-02-09 Thread Bitfox
Please send an e-mail: user-unsubscr...@spark.apache.org to unsubscribe yourself from the mailing list. On Thu, Feb 10, 2022 at 1:38 AM Yogitha Ramanathan wrote: >

Re: Help With unstructured text file with spark scala

2022-02-09 Thread Bitfox
time. > > > > Relação de Beneficiários Ativos e Excluídos >> Carteira em#27/12/2019##Todos os Beneficiários >> Operadora#AMIL >> Filial#SÃO PAULO#Unidade#Guarulhos >> >> Contrato#123456 - Test >> Empresa#Test > > > On 9 Feb 2022, at 00:58, Bit

Re: Help With unstructured text file with spark scala

2022-02-08 Thread Bitfox
Hello You can treat it as a CSV file and load it from spark: >>> df = spark.read.format("csv").option("inferSchema", "true").option("header", "true").option("sep","#").load(csv_file) >>> df.show() ++---+-+ | Plano|Código

Re: add an auto_increment column

2022-02-08 Thread Bitfox
Maybe col func is not even needed here. :) >>> df.select(F.dense_rank().over(wOrder).alias("rank"), "fruit","amount").show() ++--+--+ |rank| fruit|amount| ++--+--+ | 1|cherry| 5| | 2| apple| 3| | 2|tomato| 3| | 3|orange| 2|

foreachRDD question

2022-02-07 Thread Bitfox
Hello list, for the code in the link: https://github.com/apache/spark/blob/v3.2.1/examples/src/main/scala/org/apache/spark/examples/streaming/SqlNetworkWordCount.scala I am not sure, why enclose the RDD to Dataframe logic in a foreachRDD block? What's the use of foreachRDD? Thanks in advance.

Re: Unsubscribe

2022-02-05 Thread Bitfox
Please send an e-mail: user-unsubscr...@spark.apache.org to unsubscribe yourself from the mailing list. On Sun, Feb 6, 2022 at 2:21 PM Rishi Raj Tandon wrote: > Unsubscribe >

Re: Python performance

2022-02-04 Thread Bitfox
Please see this test of mine: https://blog.cloudcache.net/computing-performance-comparison-for-words-statistics/ Don’t use the Python RDD API; use dataframes instead. Regards On Fri, Feb 4, 2022 at 5:02 PM Hinko Kocevar wrote: > I'm looking into using Python interface with Spark and came across this >

Re:

2022-01-31 Thread Bitfox
Please send an e-mail: user-unsubscr...@spark.apache.org to unsubscribe yourself from the mailing list. On Mon, Jan 31, 2022 at 10:11 PM wrote: > unsubscribe > > >

Re:

2022-01-31 Thread Bitfox
Please send an e-mail: user-unsubscr...@spark.apache.org to unsubscribe yourself from the mailing list. On Mon, Jan 31, 2022 at 10:23 PM Gaetano Fabiano wrote: > Unsubscribe > > Sent from iPhone > > - > To unsubscribe e-mail:

Re: unsubscribe

2022-01-31 Thread Bitfox
The signature in your messages has showed how to unsubscribe. To unsubscribe e-mail: user-unsubscr...@spark.apache.org On Mon, Jan 31, 2022 at 7:53 PM Lucas Schroeder Rossi wrote: > unsubscribe > > - > To unsubscribe e-mail:

Re: why the pyspark RDD API is so slow?

2022-01-31 Thread Bitfox
the same time as they (Scala > and Python) use the same API under the hood. Therefore you can also observe > that APIs are very similar and code is written in the same fashion. > > > On Sun, 30 Jan 2022, 10:10 Bitfox, wrote: > >> Hello list, >> >> I did a compar

Re: [ANNOUNCE] Apache Kyuubi (Incubating) released 1.4.1-incubating

2022-01-30 Thread Bitfox
What’s the difference between Spark and Kyuubi? Thanks On Mon, Jan 31, 2022 at 2:45 PM Vino Yang wrote: > Hi all, > > The Apache Kyuubi (Incubating) community is pleased to announce that > Apache Kyuubi (Incubating) 1.4.1-incubating has been released! > > Apache Kyuubi (Incubating) is a

Re: unsubscribe

2022-01-30 Thread Bitfox
The signature in your mail has showed the info: To unsubscribe e-mail: user-unsubscr...@spark.apache.org On Sun, Jan 30, 2022 at 8:50 PM Lucas Schroeder Rossi wrote: > unsubscribe > > - > To unsubscribe e-mail:

why the pyspark RDD API is so slow?

2022-01-30 Thread Bitfox
Hello list, I did a comparison for pyspark RDD, scala RDD, pyspark dataframe and a pure scala program. The result shows the pyspark RDD is too slow. For the operations and dataset please see: https://blog.cloudcache.net/computing-performance-comparison-for-words-statistics/ The result table is

Re: [ANNOUNCE] Apache Spark 3.2.1 released

2022-01-28 Thread Bitfox
Is there a guide for upgrading from 3.2.0 to 3.2.1? thanks On Sat, Jan 29, 2022 at 9:14 AM huaxin gao wrote: > We are happy to announce the availability of Spark 3.2.1! > > Spark 3.2.1 is a maintenance release containing stability fixes. This > release is based on the branch-3.2 maintenance

Re: [ANNOUNCE] Apache Kafka 3.1.0

2022-01-24 Thread Bitfox
Must spark3, kafka3, scala3, python3 work together if my project used these stacks? Thanks On Tue, Jan 25, 2022 at 1:04 AM David Jacot wrote: > The Apache Kafka community is pleased to announce the release for > Apache Kafka 3.1.0. > > It is a major release that includes many new features,

may I need a join here?

2022-01-23 Thread Bitfox
rom word#0,count#1L in operator !Filter NOT word#0 IN (stopword#4).; !Filter NOT word#0 IN (stopword#4) +- LogicalRDD [word#0, count#1L], false The filter method doesn't work here. Maybe I need a join for two DF? What's the syntax for this? Thank you and regards, Bitfox

Question about ports in spark

2022-01-23 Thread Bitfox
Hello When spark started in my home server, I saw there were two ports open then. 8080 for master, 8081 for worker. If I keep these two ports open without any network filter, does it have security issues? Thanks

Re: [RELEASE CANDIDATE] mod_perl-2.0.12 RC2

2022-01-08 Thread Bitfox
Is there any update on libapr? Thanks On Sun, Jan 9, 2022 at 2:31 AM Steve Hay wrote: > On Sat, 18 Dec 2021 at 11:21, Steve Hay wrote: > > > > Please download, test, and report back on this mod_perl 2.0.12 release > > candidate. > > > > Still waiting to see the necessary votes from other

Re: Regarding contribution to Apache-Beam

2022-01-04 Thread Bitfox
Hello Maybe begin from this content? https://beam.apache.org/contribute/ Thanks On Wed, Jan 5, 2022 at 1:43 PM Devangi Das wrote: > Hello! > I want to contribute to Apache Beam .I have a fair knowledge of java and > python but I'm new to Go language.kindly guide me how to start contributing >

Re: How to make batch filter

2022-01-02 Thread Bitfox
damage or destruction of data or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Sun,

Re: How to make batch filter

2022-01-02 Thread Bitfox
OM > filters)").rdd.getNumPartitions() > 10 > ==== > > Please do refer to the following page for adaptive sql execution in SPARK > 3, it will be of massive help particularly in case you are handling skewed >

Re: How to make batch filter

2022-01-02 Thread Bitfox
imed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Sun, 2 Jan 2022 at 00:20, Bitfox wrote: > >> One more question, for this big filter, given my server has 4 Cores, will >> spark (

Re: How to make batch filter

2022-01-01 Thread Bitfox
> from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Sat, 1 Jan 2022 at 20:59, Bitfox wrote: > >> Using the datafr

Re: How to make batch filter

2022-01-01 Thread Bitfox
> > *Disclaimer:* Use it at your own risk. Any and all responsibility for any > loss, damage or destruction of data or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any mon

How to make batch filter

2022-01-01 Thread Bitfox
Using the dataframe API I need to implement a batch filter: DF. select(..).where(col(..) != ‘a’ and col(..) != ‘b’ and …) There are a lot of keywords that should be filtered for the same column in the where statement. How can I make it smarter? UDF or others? Thanks & Happy new Year! Bitfox
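
One way to avoid a long chain of `!=` comparisons, sketched here under assumptions rather than taken from the thread: collect the keywords in a list and build a single `NOT IN` predicate string from it. The helper name `not_in_predicate` and the column name `word` are hypothetical, and the helper deliberately rejects keywords containing quotes or backslashes instead of guessing at escaping rules.

```python
def not_in_predicate(column, keywords):
    """Build a SQL predicate string such as: word NOT IN ('a', 'b')."""
    if not keywords:
        raise ValueError("keywords must not be empty")
    for kw in keywords:
        # Refuse anything that would need escaping inside a SQL string literal.
        if "'" in kw or "\\" in kw:
            raise ValueError(f"unsupported character in keyword: {kw!r}")
    quoted = ", ".join(f"'{kw}'" for kw in keywords)
    return f"{column} NOT IN ({quoted})"


# Hypothetical keyword list standing in for the values to exclude.
predicate = not_in_predicate("word", ["a", "b", "the"])
print(predicate)
```

In a PySpark session the resulting string can be passed to `df.where(predicate)`. The Column API offers the same thing without string building: `df.where(~col("word").isin(keywords))`.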

Re: [ANNOUNCE] Apache Pulsar 2.7.4 released

2021-12-27 Thread bitfox
What's new features on the streaming development then? thanks On 2021-12-27 22:52, guo jiwei wrote: The Apache Pulsar team is proud to announce Apache Pulsar version 2.7.4. Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub

my first data science project with spark

2021-12-26 Thread bitfox
in Spark I want to share it here. Thanks for your reviews. regards Bitfox - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: measure running time

2021-12-24 Thread bitfox
others who have met the same issue. Happy holidays. :0 Bitfox On 2021-12-25 09:48, Hollis wrote: Replied mail From Mich Talebzadeh Date 12/25/2021 00:25 To Sean Owen

df.show() to text file

2021-12-24 Thread bitfox
Hello list, spark newbie here :0 How can I write the df.show() result to a text file in the system? I run with pyspark, not the python client programming. Thanks. - To unsubscribe e-mail: user-unsubscr...@spark.apache.org
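
A possible approach, given as a sketch only: `df.show()` prints its table to standard output, so the text can be captured by redirecting stdout while it runs. This assumes the table is written through Python's `sys.stdout`, which is how PySpark's `show` is understood to behave. The helper `save_printed_output` and the stand-in `fake_show` are hypothetical names so that the example runs without Spark.

```python
import contextlib
import io
import os
import tempfile


def save_printed_output(show, path):
    """Run a zero-argument callable and write whatever it prints to `path`."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        show()
    text = buffer.getvalue()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


def fake_show():
    # Stand-in for df.show, used only to keep the sketch self-contained.
    print("+------+")
    print("| fruit|")
    print("+------+")
    print("| apple|")
    print("+------+")


out_path = os.path.join(tempfile.gettempdir(), "df_show_demo.txt")
captured = save_printed_output(fake_show, out_path)
```

In the pyspark shell the call would look like `save_printed_output(df.show, "/tmp/df_show.txt")`. For the full data rather than a preview, writing through `df.write` is usually the better fit.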

Re: measure running time

2021-12-24 Thread bitfox
As you see below: $ pip install sparkmeasure Collecting sparkmeasure Using cached https://files.pythonhosted.org/packages/9f/bf/c9810ff2d88513ffc185e65a3ab9df6121ad5b4c78aa8d134a06177f9021/sparkmeasure-0.14.0-py2.py3-none-any.whl Installing collected packages: sparkmeasure Successfully

Re: measure running time

2021-12-24 Thread bitfox
but I already installed it: Requirement already satisfied: sparkmeasure in /usr/local/lib/python2.7/dist-packages so how? thank you. On 2021-12-24 18:15, Hollis wrote: Hi bitfox, you need to pip install sparkmeasure first. Then you can launch it in pyspark. from sparkmeasure import StageMetrics

Dataframe's storage size

2021-12-23 Thread bitfox
Hello Is it possible to know a dataframe's total storage size in bytes? such as: df.size() Traceback (most recent call last): File "", line 1, in File "/opt/spark/python/pyspark/sql/dataframe.py", line 1660, in __getattr__ "'%s' object has no attribute '%s'" %

Re: measure running time

2021-12-23 Thread bitfox
Hello list, I run with Spark 3.2.0 After I started pyspark with: $ pyspark --packages ch.cern.sparkmeasure:spark-measure_2.12:0.17 I can't load from the module sparkmeasure: from sparkmeasure import StageMetrics Traceback (most recent call last): File "", line 1, in ModuleNotFoundError:

Re: measure running time

2021-12-23 Thread bitfox
Thanks Gourav and Luca. I will try with the tools you provide in the Github. On 2021-12-23 23:40, Luca Canali wrote: Hi, I agree with Gourav that just measuring execution time is a simplistic approach that may lead you to miss important details, in particular when running distributed

measure running time

2021-12-23 Thread bitfox
hello community, In pyspark how can I measure the running time to the command? I just want to compare the running time of the RDD API and dataframe API, in my this blog: https://bitfoxtop.wordpress.com/2021/12/23/count-email-addresses-using-sparks-rdd-and-dataframe/ I tried spark.time() it
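
For a rough wall-clock comparison, a plain Python timer around each call is one option. The sketch below is generic and not taken from the thread; `timed` and the `timings` dictionary are hypothetical names, and the summed range merely stands in for a Spark job.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the with-block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum_of_squares", timings):
    # Placeholder workload; in pyspark this would be the job being measured.
    total = sum(i * i for i in range(100_000))

print(timings)
```

Because Spark evaluates lazily, the timed block has to contain an action such as `count()` or `collect()`; timing only the transformations measures little more than plan construction.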

Re: Unable to use WriteStream to write to delta file.

2021-12-17 Thread bitfox
May I ask why you don’t use spark.read and spark.write instead of readStream and writeStream? Thanks. On 2021-12-17 15:09, Abhinav Gundapaneni wrote: Hello Spark community, I’m using Apache spark(version 3.2) to read a CSV file to a dataframe using ReadStream, process the dataframe and write

issue on define a dataframe

2021-12-14 Thread bitfox
Hello, Spark newbie here :) Why I can't create the dataframe with just one column? for instance, this works: df=spark.createDataFrame([("apple",2),("orange",3)],["name","count"]) But this can't work: df=spark.createDataFrame([("apple"),("orange")],["name"]) Traceback (most recent call
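
A likely cause, assuming the snippet is run exactly as shown: in Python, parentheses alone do not make a tuple, so `("apple")` is just the string `"apple"`, and only a trailing comma produces a one-element tuple. The pure-Python sketch below shows the difference; the helper name `as_rows` is hypothetical.

```python
# Parentheses alone do not create a tuple; the trailing comma does.
not_a_row = ("apple")     # plain str
one_col_row = ("apple",)  # tuple with one element


def as_rows(values):
    """Wrap each scalar in a one-element tuple (hypothetical helper)."""
    return [(v,) for v in values]


rows = as_rows(["apple", "orange"])
print(type(not_a_row).__name__, type(one_col_row).__name__)
print(rows)
```

Passing rows shaped like `[("apple",), ("orange",)]` together with `["name"]` to `spark.createDataFrame` should then yield a one-column DataFrame.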

Re: About some Spark technical assistance

2021-12-12 Thread bitfox
github url please. On 2021-12-13 01:06, sam smith wrote: Hello guys, I am replicating a paper's algorithm (graph coloring algorithm) in Spark under Java, and thought about asking you guys for some assistance to validate / review my 600 lines of code. Any volunteers to share the code with ?

Re: creating database issue

2021-12-07 Thread bitfox
) at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source) ... 105 more Thanks. On 2021/12/8 9:28, bitfox wrote: Hello This is just a standalone deployment for testing purpose. The version: Spark 3.2.0 (git revision 5d45a415f3) built for Hadoop

Re: creating database issue

2021-12-07 Thread bitfox
Hello This is just a standalone deployment for testing purpose. The version: Spark 3.2.0 (git revision 5d45a415f3) built for Hadoop 3.3.1 Build flags: -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-3.2 -Phive -Phive-thriftserver I just started one master and one worker for the

creating database issue

2021-12-07 Thread bitfox
sorry I am newbie to spark. When I created a database in pyspark shell following the book content of learning spark 2.0, it gets: >>> spark.sql("CREATE DATABASE learn_spark_db") 21/12/08 09:01:34 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist 21/12/08 09:01:34 WARN

Re: [OUTREACH} December '21 Edition of 'Happenings in the Neighborhood' is out now

2021-12-06 Thread bitfox
Is there a blog for comparison between Apache Pulsar and Apache Spark? Thanks On 2021-12-07 09:46, Aaron Williams wrote: Hello Apache Pulsar Neighbors, For this issue [1], For this issue, we have three new committers, a new milestone, and lots of talks. Plus our normal features of a Stack