Re: unsubscribe

2024-02-21 Thread Xin Zhang
unsubscribe

On Tue, Feb 20, 2024 at 9:44 PM kritika jain  wrote:

> Unsubscribe
>
> On Tue, 20 Feb 2024, 3:18 pm Крюков Виталий Семенович,
>  wrote:
>
>>
>> unsubscribe
>>
>>
>>

-- 
Zhang Xin(张欣)
Email:josseph.zh...@gmail.com


Re: Spark 4.0 Query Analyzer Bug Report

2024-02-21 Thread Mich Talebzadeh
Indeed, valid points have been raised, including the potential typo in the
Spark version. In the meantime, I suggest you try the following alternative
debugging methods:


   - Simpler explain(): try a basic explain() or explain("extended"). This
   might provide a less detailed, but potentially functional, explanation.
   - Manual analysis: analyze the query structure and logical steps
   yourself.
   - Spark UI: review the Spark UI (accessible through your Spark
   application on port 4040) to examine query execution and potential
   bottlenecks.
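
As a rough illustration of the first suggestion, the sketch below tries several explain modes in turn and reports the first one that completes. The helper name explain_with_fallback is hypothetical. It relies only on the mode argument of DataFrame.explain, available in PySpark 3.0 and later. Whether a simpler mode gets past the failure depends on where planning breaks, so treat this as a diagnostic aid rather than a fix.

```python
def explain_with_fallback(df, modes=("extended", "formatted", "simple")):
    """Try each explain mode in order; return the first mode that succeeds.

    `df` is expected to be a PySpark DataFrame (anything exposing
    explain(mode=...) works). Returns None if every mode fails.
    """
    for mode in modes:
        try:
            # DataFrame.explain prints the plan to stdout and returns None.
            df.explain(mode=mode)
            return mode
        except Exception as exc:  # Py4JJavaError is a subclass of Exception
            print(f"explain(mode={mode!r}) failed: {type(exc).__name__}")
    return None
```

For the DataFrame in the original report this would be called as explain_with_fallback(query_result). If every mode fails, the query text and the logs leading up to the error are still needed to diagnose the issue.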


HTH



Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun).


On Wed, 21 Feb 2024 at 08:37, Holden Karau  wrote:

> Do you mean Spark 3.4? 4.0 is very much not released yet.
>
> Also it would help if you could share your query & more of the logs
> leading up to the error.
>
> On Tue, Feb 20, 2024 at 3:07 PM Sharma, Anup 
> wrote:
>
>> Hi Spark team,
>>
>>
>>
>> We ran into a dataframe issue after upgrading from spark 3.1 to 4.
>>
>>
>>
>> query_result.explain(extended=True)
>>   File "…/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py"
>>
>> raise Py4JJavaError(
>> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.sql.api.python.PythonSQLUtils.explainString.
>> : java.lang.IllegalStateException: You hit a query analyzer bug. Please report your query to Spark user mailing list.
>>     at org.apache.spark.sql.execution.SparkStrategies$Aggregation$.apply(SparkStrategies.scala:516)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
>>     at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
>>     at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
>>     at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
>>     at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
>>     at scala.collection.Iterator.foreach(Iterator.scala:943)
>>     at scala.collection.Iterator.foreach$(Iterator.scala:943)
>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>>     at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
>>     at scala.collect...
>>
>>
>>
>>
>>
>> Could you please let us know if this is already being looked at?
>>
>>
>>
>> Thanks,
>>
>> Anup
>>
>
>
> --
> Cell : 425-233-8271
>


Kafka-based Spark Streaming and Vertex AI for Sentiment Analysis

2024-02-21 Thread Mich Talebzadeh
I am working on a pet project to implement a real-time sentiment analysis
system for analyzing customer reviews. It leverages Kafka for data
ingestion, Spark Structured Streaming (SSS) for real-time processing, and
Vertex AI for sentiment analysis and potential action triggers.

*Features*

   - Real-time processing of customer reviews using SSS.
   - Sentiment analysis using pre-assigned labels or Vertex AI
   models.
   - Integration with Vertex AI for model deployment and prediction serving.
   - Potential actions based on sentiment analysis results
   (e.g., notifications, database updates).


*Tech stack*

   - Kafka: streaming platform for data ingestion.
   - SSS: real-time processing and cleansing of incoming messages.
   - Vertex AI: machine learning platform for model training.
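
To make the Kafka-to-SSS hop concrete, here is a minimal sketch of reading review messages from Kafka and parsing them with Spark Structured Streaming. The function name build_review_stream, its parameters, and the choice of a subset of fields in the schema are assumptions for illustration, not part of the project described above. It also assumes the Spark Kafka connector package (spark-sql-kafka) is available to the application.

```python
# Subset of the review attributes, expressed as a DDL schema string.
# Fields not listed here are simply ignored by from_json.
REVIEW_SCHEMA = (
    "rowkey STRING, product_id STRING, timereported STRING, "
    "review_text STRING, rating INT, sentiment STRING, "
    "sentiment_confidence DOUBLE"
)


def build_review_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame of parsed review records.

    `spark` is an existing SparkSession; `bootstrap_servers` and `topic`
    are placeholders for your own Kafka settings.
    """
    # Imported here so this module loads without a Spark installation.
    from pyspark.sql.functions import col, from_json

    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", "latest")
        .load()
    )
    # Kafka delivers the payload as bytes in the `value` column.
    parsed = raw.select(
        from_json(col("value").cast("string"), REVIEW_SCHEMA).alias("r")
    )
    return parsed.select("r.*")
```

The resulting DataFrame could then be passed to writeStream with a foreachBatch handler that calls the sentiment model, though that part is outside this sketch.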


I have created sample JSON data with relevant attributes for a product
review, as shown below:

{
  "rowkey": "7de43681-0e4a-45cb-ad40-5f14f5678333",
  "product_id": "product-id-1616",
  "timereported": "2024-02-21T08:46:40",
  "description": "Easy to use and setup, perfect for beginners.",
  "price": "GBP507",
  "sentiment": "negative",
  "product_category": "Electronics",
  "customer_id": "customer4",
  "location": "UK",
  "rating": 6,
  "review_text": "Sleek and modern design, but lacking some features.",
  "user_feedback": "Negative",
  "review_source": "online",
  "sentiment_confidence": 0.33,
  "product_features": "user-friendly",
  "timestamp": "",
  "language": "English"
}
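
As a small sketch of what the cleansing step might look like for one record, the snippet below parses a trimmed copy of the sample and normalises a few fields. It assumes that price and sentiment arrive as quoted strings and that an empty timestamp means "missing". The cleanse function and the currency/amount fields are hypothetical additions for illustration.

```python
import json
import re

# Trimmed copy of the sample record, with price and sentiment quoted.
RAW = """{
  "rowkey": "7de43681-0e4a-45cb-ad40-5f14f5678333",
  "product_id": "product-id-1616",
  "price": "GBP507",
  "sentiment": "negative",
  "rating": 6,
  "sentiment_confidence": 0.33,
  "timestamp": ""
}"""

# Three-letter currency code followed by a numeric amount, e.g. GBP507.
PRICE_RE = re.compile(r"^([A-Z]{3})(\d+(?:\.\d+)?)$")


def cleanse(record):
    """Return a copy of the record with price split and blanks nulled."""
    out = dict(record)
    match = PRICE_RE.match(str(record.get("price", "")))
    out["currency"] = match.group(1) if match else None
    out["amount"] = float(match.group(2)) if match else None
    # Normalise the label's case; treat empty strings as missing.
    out["sentiment"] = str(record.get("sentiment", "")).strip().lower() or None
    out["timestamp"] = record.get("timestamp") or None
    return out


review = cleanse(json.loads(RAW))
print(review["currency"], review["amount"], review["sentiment"])
```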

I have also attached a high-level diagram. There has recently been demand
for Gemini usage. Your views are appreciated.


Thanks

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom





*Disclaimer:* I am an architect and not a data scientist. The information
provided is correct to the best of my knowledge but of course cannot be
guaranteed. It is essential to note that, as with any advice, "one test
result is worth one-thousand expert opinions" (Wernher von Braun).

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org