[jira] [Created] (SPARK-48362) Add CollectSetWithLimit

2024-05-20 Thread Holden Karau (Jira)
Holden Karau created SPARK-48362:


 Summary: Add CollectSetWithLimit
 Key: SPARK-48362
 URL: https://issues.apache.org/jira/browse/SPARK-48362
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Holden Karau


See 
[https://stackoverflow.com/questions/38730912/how-to-limit-functions-collect-set-in-spark-sql]

 

Some users want to collect a set, but if the number of distinct elements is too 
large they may get a "Cannot grow BufferHolder" error when they collect the full 
set and then trim it.

 

We should offer a collect-set variant that preemptively stops adding elements 
once the limit is reached, reducing the amount of memory used.
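
For illustration, a minimal sketch of the current workaround next to the shape 
of the proposed aggregate (the `collect_set_with_limit` name and signature are 
hypothetical, not an existing Spark API):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("collect-set-limit").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("a", 2), ("a", 3), ("b", 1)).toDF("user", "item")

// Today's workaround: collect the full set, then trim it with slice(). The
// full set is still materialized per group first, which is what can trigger
// "Cannot grow BufferHolder" on high-cardinality columns.
val trimmed = df.groupBy($"user")
  .agg(slice(collect_set($"item"), 1, 100).as("items"))

// Hypothetical shape of the proposed aggregate (illustrative only): stop
// adding elements once the cap is reached, so per-group memory stays bounded.
// val capped = df.groupBy($"user")
//   .agg(collect_set_with_limit($"item", 100).as("items"))
```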



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[LincolnTalk] Stucco

2024-05-16 Thread sarah cannon holden
Several months ago I sent out a request for STUCCO recommendations. This
is what I got:

TAG Plastering
Corbett Plastering
Hub Masonry Inc

I reached out.  I got no response.  I set a date with Tony at Corbett and
he didn't show up.

SO.  Any suggestions?  Any other recommendations?

Thank you.  Sarah
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[jira] [Assigned] (SPARK-44953) Log a warning (or automatically disable) when shuffle tracking is enabled alongside another DA-supported mechanism

2024-05-13 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau reassigned SPARK-44953:


Assignee: binjie yang

> Log a warning (or automatically disable) when shuffle tracking is enabled 
> alongside another DA-supported mechanism
> ---
>
> Key: SPARK-44953
> URL: https://issues.apache.org/jira/browse/SPARK-44953
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Holden Karau
>Assignee: binjie yang
>Priority: Major
>  Labels: pull-request-available
>
> Some people enable both shuffle tracking and another mechanism (like 
> migration) and then are confused when their jobs don't scale down.
>  
> We should at least log a warning here (or automatically disable shuffle 
> tracking?) when it is configured alongside another DA-supported mechanism.
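
For reference, a minimal sketch (not from the ticket) of the kind of 
conflicting configuration being described, using standard Spark config keys:

```scala
import org.apache.spark.sql.SparkSession

// Shuffle tracking and shuffle block decommissioning are both DA-supported
// mechanisms; enabling both can keep idle executors from being released.
val spark = SparkSession.builder()
  .appName("da-conflict")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  .config("spark.decommission.enabled", "true")
  .config("spark.storage.decommission.shuffleBlocks.enabled", "true")
  .getOrCreate()
```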



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44953) Log a warning (or automatically disable) when shuffle tracking is enabled alongside another DA-supported mechanism

2024-05-13 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau resolved SPARK-44953.
--
Resolution: Fixed

> Log a warning (or automatically disable) when shuffle tracking is enabled 
> alongside another DA-supported mechanism
> ---
>
> Key: SPARK-44953
> URL: https://issues.apache.org/jira/browse/SPARK-44953
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Holden Karau
>Assignee: binjie yang
>Priority: Major
>  Labels: pull-request-available
>
> Some people enable both shuffle tracking and another mechanism (like 
> migration) and then are confused when their jobs don't scale down.
>  
> We should at least log a warning here (or automatically disable shuffle 
> tracking?) when it is configured alongside another DA-supported mechanism.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



(spark) branch master updated: [SPARK-44953][CORE] Log a warning when shuffle tracking is enabled alongside another DA-supported mechanism

2024-05-13 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a101c48dd965 [SPARK-44953][CORE] Log a warning when shuffle tracking 
is enabled alongside another DA-supported mechanism
a101c48dd965 is described below

commit a101c48dd9650d2bca2047b91f9e2a3ba90f142d
Author: zwangsheng 
AuthorDate: Mon May 13 13:33:34 2024 -0700

[SPARK-44953][CORE] Log a warning when shuffle tracking is enabled alongside 
another DA-supported mechanism

### What changes were proposed in this pull request?

Log a warning when shuffle tracking is enabled alongside another DA-supported 
mechanism

### Why are the changes needed?

Some users enable both shuffle tracking and another mechanism (like 
migration) and then are confused when their jobs don't scale down.

https://issues.apache.org/jira/browse/SPARK-44953

### Does this PR introduce _any_ user-facing change?

Yes, users will see a warning log when both shuffle tracking and another 
DA-supported mechanism (shuffle decommission) are enabled.

### How was this patch tested?

No

### Was this patch authored or co-authored using generative AI tooling?

NO

Closes #45454 from zwangsheng/SPARK-44953.

Authored-by: zwangsheng 
Signed-off-by: Holden Karau 
---
 .../scala/org/apache/spark/ExecutorAllocationManager.scala | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
index 3bfa1ae0d4dc..1fe02eec3a07 100644
--- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
+++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -206,11 +206,13 @@ private[spark] class ExecutorAllocationManager(
       throw new SparkException(
         s"s${DYN_ALLOCATION_SUSTAINED_SCHEDULER_BACKLOG_TIMEOUT.key} must be > 0!")
     }
+    val shuffleTrackingEnabled = conf.get(config.DYN_ALLOCATION_SHUFFLE_TRACKING_ENABLED)
+    val shuffleDecommissionEnabled = decommissionEnabled &&
+      conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED)
     if (!conf.get(config.SHUFFLE_SERVICE_ENABLED) && !reliableShuffleStorage) {
-      if (conf.get(config.DYN_ALLOCATION_SHUFFLE_TRACKING_ENABLED)) {
+      if (shuffleTrackingEnabled) {
         logInfo("Dynamic allocation is enabled without a shuffle service.")
-      } else if (decommissionEnabled &&
-          conf.get(config.STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED)) {
+      } else if (shuffleDecommissionEnabled) {
         logInfo("Shuffle data decommission is enabled without a shuffle service.")
       } else if (!testing) {
         throw new SparkException("Dynamic allocation of executors requires one of the " +
@@ -224,6 +226,12 @@ private[spark] class ExecutorAllocationManager(
       }
     }
 
+    if (shuffleTrackingEnabled && (shuffleDecommissionEnabled || reliableShuffleStorage)) {
+      logWarning("You are enabling both shuffle tracking and other DA supported mechanism, " +
+        "which will cause idle executors not to be released in a timely, " +
+        "please check the configurations.")
+    }
+
     if (executorAllocationRatio > 1.0 || executorAllocationRatio <= 0.0) {
       throw new SparkException(
         s"${DYN_ALLOCATION_EXECUTOR_ALLOCATION_RATIO.key} must be > 0 and <= 1.0")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



Re: [DISCUSS] Spark 4.0.0 release

2024-05-08 Thread Holden Karau
That looks cool, maybe let’s split off a thread on how to improve our
release processes?

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Wed, May 8, 2024 at 9:31 AM Erik Krogen  wrote:

> On that note, GitHub recently released (public preview) a new feature
> called Artifact Attestions which may be relevant/useful here: Introducing
> Artifact Attestations–now in public beta - The GitHub Blog
> <https://github.blog/2024-05-02-introducing-artifact-attestations-now-in-public-beta/>
>
> On Wed, May 8, 2024 at 9:06 AM Nimrod Ofek  wrote:
>
>> I have no permissions so I can't do it but I'm happy to help (although I
>> am more familiar with Gitlab CICD than Github Actions).
>> Is there some point of contact that can provide me needed context and
>> permissions?
>> I'd also love to see why the costs are high and see how we can reduce
>> them...
>>
>> Thanks,
>> Nimrod
>>
>> On Wed, May 8, 2024 at 8:26 AM Holden Karau 
>> wrote:
>>
>>> I think signing the artifacts produced from a secure CI sounds like a
>>> good idea. I know we’ve been asked to reduce our GitHub action usage but
>>> perhaps someone interested could volunteer to set that up.
>>>
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>>
>>> On Tue, May 7, 2024 at 9:43 PM Nimrod Ofek 
>>> wrote:
>>>
>>>> Hi,
>>>> Thanks for the reply.
>>>>
>>>> From my experience, a build on a build server would be much more
>>>> predictable and less error prone than building on some laptop- and of
>>>> course much faster to have builds, snapshots, release candidates, early
>>>> previews releases, release candidates or final releases.
>>>> It will enable us to have a preview version with current changes-
>>>> snapshot version, either automatically every day or if we need to save
>>>> costs (although build is really not expensive) - with a click of a button.
>>>>
>>>> Regarding keys for signing. - that's what vaults are for, all across
>>>> the industry we are using vaults (such as hashicorp vault)- but if the
>>>> build will be automated and the only thing which will be manual is to sign
>>>> the release for security reasons that would be reasonable.
>>>>
>>>> Thanks,
>>>> Nimrod
>>>>
>>>>
>>>> בתאריך יום ד׳, 8 במאי 2024, 00:54, מאת Holden Karau ‏<
>>>> holden.ka...@gmail.com>:
>>>>
>>>>> Indeed. We could conceivably build the release in CI/CD but the final
>>>>> verification / signing should be done locally to keep the keys safe (there
>>>>> was some concern from earlier release processes).
>>>>>
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>
>>>>>
>>>>> On Tue, May 7, 2024 at 10:55 AM Nimrod Ofek 
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Sorry for the novice question, Wenchen - the release is done manually
>>>>>> from a laptop? Not using a CI CD process on a build server?
>>>>>>
>>>>>> Thanks,
>>>>>> Nimrod
>>>>>>
>>>>>> On Tue, May 7, 2024 at 8:50 PM Wenchen Fan 
>>>>>> wrote:
>>>>>>
>>>>>>> UPDATE:
>>>>>>>
>>>>>>> Unfortunately, it took me quite some time to set up my laptop and
>>>>>>> get it ready for the release process (docker desktop doesn't work 
>>>>>>> anymore,
>>>>>>> my pgp key is lost, etc.). I'll start the RC process tomorrow (my time). 
>>>>>>> Thanks
>>>>>>> for your patience!
>>>>>>>
>>>>>>> Wenchen
>>>>>>>
>>>>>>> On Fri, May 3, 2024 at 7:47 AM yangjie01 
>>>>>>

Re: [DISCUSS] Spark 4.0.0 release

2024-05-07 Thread Holden Karau
I think signing the artifacts produced from a secure CI sounds like a good
idea. I know we’ve been asked to reduce our GitHub action usage but perhaps
someone interested could volunteer to set that up.

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Tue, May 7, 2024 at 9:43 PM Nimrod Ofek  wrote:

> Hi,
> Thanks for the reply.
>
> From my experience, a build on a build server would be much more
> predictable and less error prone than building on some laptop- and of
> course much faster to have builds, snapshots, release candidates, early
> previews releases, release candidates or final releases.
> It will enable us to have a preview version with current changes- snapshot
> version, either automatically every day or if we need to save costs
> (although build is really not expensive) - with a click of a button.
>
> Regarding keys for signing. - that's what vaults are for, all across the
> industry we are using vaults (such as hashicorp vault)- but if the build
> will be automated and the only thing which will be manual is to sign the
> release for security reasons that would be reasonable.
>
> Thanks,
> Nimrod
>
>
> בתאריך יום ד׳, 8 במאי 2024, 00:54, מאת Holden Karau ‏<
> holden.ka...@gmail.com>:
>
>> Indeed. We could conceivably build the release in CI/CD but the final
>> verification / signing should be done locally to keep the keys safe (there
>> was some concern from earlier release processes).
>>
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>>
>> On Tue, May 7, 2024 at 10:55 AM Nimrod Ofek 
>> wrote:
>>
>>> Hi,
>>>
>>> Sorry for the novice question, Wenchen - the release is done manually
>>> from a laptop? Not using a CI CD process on a build server?
>>>
>>> Thanks,
>>> Nimrod
>>>
>>> On Tue, May 7, 2024 at 8:50 PM Wenchen Fan  wrote:
>>>
>>>> UPDATE:
>>>>
>>>> Unfortunately, it took me quite some time to set up my laptop and get
>>>> it ready for the release process (docker desktop doesn't work anymore, my
>>>> pgp key is lost, etc.). I'll start the RC process tomorrow (my time). Thanks
>>>> for your patience!
>>>>
>>>> Wenchen
>>>>
>>>> On Fri, May 3, 2024 at 7:47 AM yangjie01  wrote:
>>>>
>>>>> +1
>>>>>
>>>>>
>>>>>
>>>>> *发件人**: *Jungtaek Lim 
>>>>> *日期**: *2024年5月2日 星期四 10:21
>>>>> *收件人**: *Holden Karau 
>>>>> *抄送**: *Chao Sun , Xiao Li ,
>>>>> Tathagata Das , Wenchen Fan <
>>>>> cloud0...@gmail.com>, Cheng Pan , Nicholas Chammas
>>>>> , Dongjoon Hyun ,
>>>>> Cheng Pan , Spark dev list ,
>>>>> Anish Shrigondekar 
>>>>> *主题**: *Re: [DISCUSS] Spark 4.0.0 release
>>>>>
>>>>>
>>>>>
>>>>> +1 love to see it!
>>>>>
>>>>>
>>>>>
>>>>> On Thu, May 2, 2024 at 10:08 AM Holden Karau 
>>>>> wrote:
>>>>>
>>>>> +1 :) yay previews
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 1, 2024 at 5:36 PM Chao Sun  wrote:
>>>>>
>>>>> +1
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 1, 2024 at 5:23 PM Xiao Li  wrote:
>>>>>
>>>>> +1 for next Monday.
>>>>>
>>>>>
>>>>>
>>>>> We can do more previews when the other features are ready for preview.
>>>>>
>>>>>
>>>>>
>>>>> Tathagata Das  于2024年5月1日周三 08:46写道:
>>>>>
>>>>> Next week sounds great! Thank you Wenchen!
>>>>>
>>>>>
>>>>>
>>>>> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan 
>>>>> wrote:
>>>>>
>>>>> Yea I think a preview release won't hurt (without a branch cut). We
>>>>> don't need to wait for all the ongoing projects to be ready. How about we
>>>>> do a 4.0 preview release based on the current master branch next Monday?
>>>>>

Re: [DISCUSS] Spark 4.0.0 release

2024-05-07 Thread Holden Karau
Indeed. We could conceivably build the release in CI/CD but the final
verification / signing should be done locally to keep the keys safe (there
was some concern from earlier release processes).

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Tue, May 7, 2024 at 10:55 AM Nimrod Ofek  wrote:

> Hi,
>
> Sorry for the novice question, Wenchen - the release is done manually from
> a laptop? Not using a CI CD process on a build server?
>
> Thanks,
> Nimrod
>
> On Tue, May 7, 2024 at 8:50 PM Wenchen Fan  wrote:
>
>> UPDATE:
>>
>> Unfortunately, it took me quite some time to set up my laptop and get it
>> ready for the release process (docker desktop doesn't work anymore, my pgp
>> key is lost, etc.). I'll start the RC process tomorrow (my time). Thanks for
>> your patience!
>>
>> Wenchen
>>
>> On Fri, May 3, 2024 at 7:47 AM yangjie01  wrote:
>>
>>> +1
>>>
>>>
>>>
>>> *发件人**: *Jungtaek Lim 
>>> *日期**: *2024年5月2日 星期四 10:21
>>> *收件人**: *Holden Karau 
>>> *抄送**: *Chao Sun , Xiao Li ,
>>> Tathagata Das , Wenchen Fan <
>>> cloud0...@gmail.com>, Cheng Pan , Nicholas Chammas <
>>> nicholas.cham...@gmail.com>, Dongjoon Hyun ,
>>> Cheng Pan , Spark dev list ,
>>> Anish Shrigondekar 
>>> *主题**: *Re: [DISCUSS] Spark 4.0.0 release
>>>
>>>
>>>
>>> +1 love to see it!
>>>
>>>
>>>
>>> On Thu, May 2, 2024 at 10:08 AM Holden Karau 
>>> wrote:
>>>
>>> +1 :) yay previews
>>>
>>>
>>>
>>> On Wed, May 1, 2024 at 5:36 PM Chao Sun  wrote:
>>>
>>> +1
>>>
>>>
>>>
>>> On Wed, May 1, 2024 at 5:23 PM Xiao Li  wrote:
>>>
>>> +1 for next Monday.
>>>
>>>
>>>
>>> We can do more previews when the other features are ready for preview.
>>>
>>>
>>>
>>> Tathagata Das  于2024年5月1日周三 08:46写道:
>>>
>>> Next week sounds great! Thank you Wenchen!
>>>
>>>
>>>
>>> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan  wrote:
>>>
>>> Yea I think a preview release won't hurt (without a branch cut). We
>>> don't need to wait for all the ongoing projects to be ready. How about we
>>> do a 4.0 preview release based on the current master branch next Monday?
>>>
>>>
>>>
>>> On Wed, May 1, 2024 at 11:06 PM Tathagata Das <
>>> tathagata.das1...@gmail.com> wrote:
>>>
>>> Hey all,
>>>
>>>
>>>
>>> Reviving this thread, but Spark master has already accumulated a huge
>>> amount of changes.  As a downstream project maintainer, I want to really
>>> start testing the new features and other breaking changes, and it's hard to
>>> do that without a Preview release. So the sooner we make a Preview release,
>>> the faster we can start getting feedback for fixing things for a great
>>> Spark 4.0 final release.
>>>
>>>
>>>
>>> So I urge the community to produce a Spark 4.0 Preview soon even if
>>> certain features targeting the Delta 4.0 release are still incomplete.
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Apr 17, 2024 at 8:35 AM Wenchen Fan  wrote:
>>>
>>> Thank you all for the replies!
>>>
>>>
>>>
>>> To @Nicholas Chammas  : Thanks for cleaning
>>> up the error terminology and documentation! I've merged the first PR and
>>> let's finish others before the 4.0 release.
>>>
>>> To @Dongjoon Hyun  : Thanks for driving the
>>> ANSI on by default effort! Now the vote has passed, let's flip the config
>>> and finish the DataFrame error context feature before 4.0.
>>>
>>> To @Jungtaek Lim  : Ack. We can treat the
>>> Streaming state store data source as completed for 4.0 then.
>>>
>>> To @Cheng Pan  : Yea we definitely should have a
>>> preview release. Let's collect more feedback on the ongoing projects and
>>> then we can propose a date for the preview release.
>>>
>>>
>>>
>>> On Wed, Apr 17, 2024 at 1:22 PM Cheng Pan  wrote:
>>>
>>> will we have preview release for 4.0.0 like we did for 2.0.0 and 3.0.0?
>>>
>

Re: ASF board report draft for May

2024-05-06 Thread Holden Karau
I trust Wenchen to manage the preview release effectively, but if there are
concerns around how to manage a developer preview release, let's split that
off from the board report discussion.

On Mon, May 6, 2024 at 10:44 AM Mich Talebzadeh 
wrote:

> I did some historical digging on this.
>
> Whilst both preview release and RCs are pre-release versions, the main
> difference lies in their maturity and readiness for production use. Preview
> releases are early versions aimed at gathering feedback, while release
> candidates (RCs) are nearly finished versions that undergo final testing
> and voting before the official release.
>
> So in our case, we have two options:
>
>
>1. Skip mentioning of the Preview and focus on "We are intending to
>gather feedback on version 4 by releasing an earlier version to the
>community for look and feel feedback, especially focused on APIs
>2. Mention Preview in the form. "There will be a Preview release with
>the aim of gathering feedback from the community focused on APIs"
>
> IMO Preview release does not require a formal vote. Preview releases are
> often considered experimental or pre-alpha versions and are not expected to
> meet the same level of stability and completeness as release candidates or
> final releases.
>
> HTH
>
> Mich Talebzadeh,
> Technologist | Architect | Data Engineer  | Generative AI | FinCrime
> London
> United Kingdom
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Wernher von Braun
> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
>
> On Mon, 6 May 2024 at 14:10, Mich Talebzadeh 
> wrote:
>
>> @Wenchen Fan 
>>
>> Thanks for the update! To clarify, is the vote for approving a specific
>> preview build, or is it for moving towards an RC stage? I gather there is a
>> distinction between these two?
>>
>>
>> Mich Talebzadeh,
>> Technologist | Architect | Data Engineer  | Generative AI | FinCrime
>> London
>> United Kingdom
>>
>>
>>view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed . It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Wernher von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>
>>
>> On Mon, 6 May 2024 at 13:03, Wenchen Fan  wrote:
>>
>>> The preview release also needs a vote. I'll try my best to cut the RC on
>>> Monday, but the actual release may take some time. Hopefully, we can get it
>>> out this week but if the vote fails, it will take longer as we need more
>>> RCs.
>>>
>>> On Mon, May 6, 2024 at 7:22 AM Dongjoon Hyun 
>>> wrote:
>>>
>>>> +1 for Holden's comment. Yes, it would be great to mention `it` as
>>>> "soon".
>>>> (If Wenchen release it on Monday, we can simply mention the release)
>>>>
>>>> In addition, Apache Spark PMC received an official notice from ASF
>>>> Infra team.
>>>>
>>>> https://lists.apache.org/thread/rgy1cg17tkd3yox7qfq87ht12sqclkbg
>>>> > [NOTICE] Apache Spark's GitHub Actions usage exceeds allowances for
>>>> ASF projects
>>>>
>>>> To track and comply with the new ASF Infra Policy as much as possible,
>>>> we opened a blocker-level JIRA issue and have been working on it.
>>>> - https://infra.apache.org/github-actions-policy.html
>>>>
>>>> Please include a sentence that Apache Spark PMC is working on under the
>>>> following umbrella JIRA issue.
>>>>
>>>> https://issues.apache.org/jira/browse/SPARK-48094
>>>> > Reduce GitHub Action usage according to ASF project allowance
>>>>
>>>> Thanks,
>>>> Dongjoon.
>>>>
>>>>
>>>> On Sun, May 5, 2024 at 3:45 PM Holden Karau 

Re: ASF board report draft for May

2024-05-06 Thread Holden Karau
If folks are against the term soon we could say “in-progress”

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Mon, May 6, 2024 at 2:08 AM Mich Talebzadeh 
wrote:

> Hi,
>
> We should reconsider using the term "soon" for ASF board as it is
> subjective with no date (assuming this is an official communication on
> Wednesday). We ought to say
>
>  "Spark 4, the next major release after Spark 3.x, is currently under
> development. We plan to make a preview version available for evaluation as
> soon as it is feasible"
>
> HTH
>
> Mich Talebzadeh,
> Technologist | Architect | Data Engineer  | Generative AI | FinCrime
> London
> United Kingdom
>
>
>view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed . It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Wernher von Braun
> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
>
> On Mon, 6 May 2024 at 05:09, Dongjoon Hyun 
> wrote:
>
>> +1 for Holden's comment. Yes, it would be great to mention `it` as
>> "soon".
>> (If Wenchen release it on Monday, we can simply mention the release)
>>
>> In addition, Apache Spark PMC received an official notice from ASF Infra
>> team.
>>
>> https://lists.apache.org/thread/rgy1cg17tkd3yox7qfq87ht12sqclkbg
>> > [NOTICE] Apache Spark's GitHub Actions usage exceeds allowances for ASF
>> projects
>>
>> To track and comply with the new ASF Infra Policy as much as possible, we
>> opened a blocker-level JIRA issue and have been working on it.
>> - https://infra.apache.org/github-actions-policy.html
>>
>> Please include a sentence that Apache Spark PMC is working on under the
>> following umbrella JIRA issue.
>>
>> https://issues.apache.org/jira/browse/SPARK-48094
>> > Reduce GitHub Action usage according to ASF project allowance
>>
>> Thanks,
>> Dongjoon.
>>
>>
>> On Sun, May 5, 2024 at 3:45 PM Holden Karau 
>> wrote:
>>
>>> Do we want to include that we’re planning on having a preview release of
>>> Spark 4 so folks can see the APIs “soon”?
>>>
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>>
>>> On Sun, May 5, 2024 at 3:24 PM Matei Zaharia 
>>> wrote:
>>>
>>>> It’s time for our quarterly ASF board report on Apache Spark this
>>>> Wednesday. Here’s a draft, feel free to suggest changes.
>>>>
>>>> 
>>>>
>>>> Description:
>>>>
>>>> Apache Spark is a fast and general purpose engine for large-scale data
>>>> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
>>>> well as a rich set of libraries including stream processing, machine
>>>> learning, and graph analytics.
>>>>
>>>> Issues for the board:
>>>>
>>>> - None
>>>>
>>>> Project status:
>>>>
>>>> - We made two patch releases: Spark 3.5.1 on February 28, 2024, and
>>>> Spark 3.4.2 on April 18, 2024.
>>>> - The votes on "SPIP: Structured Logging Framework for Apache Spark"
>>>> and "Pure Python Package in PyPI (Spark Connect)" have passed.
>>>> - The votes for two behavior changes have passed: "SPARK-44444: Use
>>>> ANSI SQL mode by default" and "SPARK-46122: Set
>>>> spark.sql.legacy.createHiveTableByDefault to false".
>>>> - The community decided that upcoming Spark 4.0 release will drop
>>>> support for Python 3.8.
>>>> - We started a discussion about the definition of behavior changes that
>>>> is critical for version upgrades and user experience.
>>>> - We've opened a dedicated repository for the Spark Kubernetes Operator
>>>> at https://github.com/apache/spark-kubernetes-operator. We added a new
>>>> version in Apache Spark JIRA for versioning of the Spark operator based on
>>>> a vote result.
>>>>
>>>> Trademarks:
>>>>
>>>> - No changes since the last report.
>>>>
>>>> Latest releases:
>>>> - Spark 3.4.3 was released on April 18, 2024
>>>> - Spark 3.5.1 was released on February 28, 2024
>>>> - Spark 3.3.4 was released on December 16, 2023
>>>>
>>>> Committers and PMC:
>>>>
>>>> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
>>>> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
>>>> Yikun Jiang).
>>>>
>>>> 
>>>> -
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>>>


Re: ASF board report draft for May

2024-05-05 Thread Holden Karau
Do we want to include that we’re planning on having a preview release of
Spark 4 so folks can see the APIs “soon”?

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Sun, May 5, 2024 at 3:24 PM Matei Zaharia 
wrote:

> It’s time for our quarterly ASF board report on Apache Spark this
> Wednesday. Here’s a draft, feel free to suggest changes.
>
> 
>
> Description:
>
> Apache Spark is a fast and general purpose engine for large-scale data
> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
> well as a rich set of libraries including stream processing, machine
> learning, and graph analytics.
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - We made two patch releases: Spark 3.5.1 on February 28, 2024, and Spark
> 3.4.2 on April 18, 2024.
> - The votes on "SPIP: Structured Logging Framework for Apache Spark" and
> "Pure Python Package in PyPI (Spark Connect)" have passed.
> - The votes for two behavior changes have passed: "SPARK-44444: Use ANSI
> SQL mode by default" and "SPARK-46122: Set
> spark.sql.legacy.createHiveTableByDefault to false".
> - The community decided that upcoming Spark 4.0 release will drop support
> for Python 3.8.
> - We started a discussion about the definition of behavior changes that is
> critical for version upgrades and user experience.
> - We've opened a dedicated repository for the Spark Kubernetes Operator at
> https://github.com/apache/spark-kubernetes-operator. We added a new
> version in Apache Spark JIRA for versioning of the Spark operator based on
> a vote result.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
> - Spark 3.4.3 was released on April 18, 2024
> - Spark 3.5.1 was released on February 28, 2024
> - Spark 3.3.4 was released on December 16, 2023
>
> Committers and PMC:
>
> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
> Yikun Jiang).
>
> 
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


[LincolnTalk] Summer Rental in Maine

2024-05-05 Thread sarah cannon holden
Please email me directly if you are interested in this rental.  Sarah


*SOLACE  BY  THE  SEA*

SUMMER  RENTAL

*S. HARPSWELL, MAINE (near Brunswick)*



Cottage that sleeps 4 adults and 2 children

( + smaller cottage that sleeps 2).

June – 1 week:  $1907.50 ( + $763.00  for 2nd cottage)

July – 1 week:   $3242.75 for both cottages

Saturday to Saturday rental

Full kitchen/laundry/bath + outdoor shower.

1 king, 1 queen, 1 bunk bed ( + 2 twins)

Comfortable, relaxing and informal.  Private dock.

Outdoor grill/picnic table.  Peaceful.

By water’s edge, views and lobster boat activity.

Nearby restaurant.  Nearby hiking trails.  Rowboat.
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[jira] [Updated] (SPARK-48101) When using INSERT OVERWRITE with Spark CTEs they may not be fully resolved

2024-05-02 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-48101:
-
Priority: Minor  (was: Major)

> When using INSERT OVERWRITE with Spark CTEs they may not be fully resolved
> --
>
> Key: SPARK-48101
> URL: https://issues.apache.org/jira/browse/SPARK-48101
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.4.0, 3.5.1
>Reporter: Holden Karau
>Priority: Minor
>
> Repro:
> ```sql
> DROP TABLE IF EXISTS local.cte1;
> DROP TABLE IF EXISTS local.cte2;
> DROP TABLE IF EXISTS local.cte3;
> CREATE TABLE local.cte1 (id INT, fname STRING);
> CREATE TABLE local.cte2 (id2 INT);
> CREATE TABLE local.cte3 (id INT);
> WITH test_fake AS (SELECT * FROM local.cte1 WHERE id = 1 AND id2 = 1), 
> test_fake2 AS (SELECT * FROM local.cte2 WHERE id2 = 1) INSERT OVERWRITE TABLE 
> local.cte3 SELECT id2 as id FROM test_fake2;
> WITH test_fake AS (SELECT * FROM local.cte1 WHERE id = 1 AND id2 = 1), 
> test_fake2 AS (SELECT * FROM local.cte2 WHERE id2 = 1) SELECT id2 as id FROM 
> test_fake2;
> ```
>  
> Here we would expect both of the last two SQL expressions to fail, but 
> instead only the first one does.
>  
> There are more complicated cases, and in those cases, the invalid CTE is 
> treated as a null table, but this is the simplest repro I've been able to 
> come up with so far.
>  
> This occurs with both a local Iceberg catalog and the SparkSession catalog.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48101) When using INSERT OVERWRITE with Spark CTEs they may not be fully resolved

2024-05-02 Thread Holden Karau (Jira)
Holden Karau created SPARK-48101:


 Summary: When using INSERT OVERWRITE with Spark CTEs they may not 
be fully resolved
 Key: SPARK-48101
 URL: https://issues.apache.org/jira/browse/SPARK-48101
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.5.1, 3.4.0, 3.3.0
Reporter: Holden Karau


Repro:

```sql

DROP TABLE IF EXISTS local.cte1;
DROP TABLE IF EXISTS local.cte2;
DROP TABLE IF EXISTS local.cte3;
CREATE TABLE local.cte1 (id INT, fname STRING);
CREATE TABLE local.cte2 (id2 INT);
CREATE TABLE local.cte3 (id INT);
WITH test_fake AS (SELECT * FROM local.cte1 WHERE id = 1 AND id2 = 1), 
test_fake2 AS (SELECT * FROM local.cte2 WHERE id2 = 1) INSERT OVERWRITE TABLE 
local.cte3 SELECT id2 as id FROM test_fake2;
WITH test_fake AS (SELECT * FROM local.cte1 WHERE id = 1 AND id2 = 1), 
test_fake2 AS (SELECT * FROM local.cte2 WHERE id2 = 1) SELECT id2 as id FROM 
test_fake2;

```

 

Here we would expect both of the last two SQL expressions to fail, but instead 
only the first one does.

 

There are more complicated cases, and in those cases, the invalid CTE is 
treated as a null table, but this is the simplest repro I've been able to come 
up with so far.

 

This occurs with both a local Iceberg catalog and the SparkSession catalog.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Holden Karau
+1 :) yay previews

On Wed, May 1, 2024 at 5:36 PM Chao Sun  wrote:

> +1
>
> On Wed, May 1, 2024 at 5:23 PM Xiao Li  wrote:
>
>> +1 for next Monday.
>>
>> We can do more previews when the other features are ready for preview.
>>
>> Tathagata Das  于2024年5月1日周三 08:46写道:
>>
>>> Next week sounds great! Thank you Wenchen!
>>>
>>> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan  wrote:
>>>
 Yea I think a preview release won't hurt (without a branch cut). We
 don't need to wait for all the ongoing projects to be ready. How about we
 do a 4.0 preview release based on the current master branch next Monday?

 On Wed, May 1, 2024 at 11:06 PM Tathagata Das <
 tathagata.das1...@gmail.com> wrote:

> Hey all,
>
> Reviving this thread, but Spark master has already accumulated a huge
> amount of changes.  As a downstream project maintainer, I want to really
> start testing the new features and other breaking changes, and it's hard 
> to
> do that without a Preview release. So the sooner we make a Preview 
> release,
> the faster we can start getting feedback for fixing things for a great
> Spark 4.0 final release.
>
> So I urge the community to produce a Spark 4.0 Preview soon even if
> certain features targeting the Delta 4.0 release are still incomplete.
>
> Thanks!
>
>
> On Wed, Apr 17, 2024 at 8:35 AM Wenchen Fan 
> wrote:
>
>> Thank you all for the replies!
>>
>> To @Nicholas Chammas  : Thanks for
>> cleaning up the error terminology and documentation! I've merged the 
>> first
>> PR and let's finish others before the 4.0 release.
>> To @Dongjoon Hyun  : Thanks for driving the
>> ANSI on by default effort! Now the vote has passed, let's flip the config
>> and finish the DataFrame error context feature before 4.0.
>> To @Jungtaek Lim  : Ack. We can treat
>> the Streaming state store data source as completed for 4.0 then.
>> To @Cheng Pan  : Yea we definitely should have
>> a preview release. Let's collect more feedback on the ongoing projects 
>> and
>> then we can propose a date for the preview release.
>>
>> On Wed, Apr 17, 2024 at 1:22 PM Cheng Pan  wrote:
>>
>>> will we have preview release for 4.0.0 like we did for 2.0.0 and
>>> 3.0.0?
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>>
>>> > On Apr 15, 2024, at 09:58, Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>> >
>>> > W.r.t. state data source - reader (SPARK-45511), there are several
>>> follow-up tickets, but we don't plan to address them soon. The current
>>> implementation is the final shape for Spark 4.0.0, unless there are 
>>> demands
>>> on the follow-up tickets.
>>> >
>>> > We may want to check the plan for transformWithState - my
>>> understanding is that we want to release the feature to 4.0.0, but there
>>> are several remaining works to be done. While the tentative timeline for
>>> releasing is June 2024, what would be the tentative timeline for the RC 
>>> cut?
>>> > (cc. Anish to add more context on the plan for transformWithState)
>>> >
>>> > On Sat, Apr 13, 2024 at 3:15 AM Wenchen Fan 
>>> wrote:
>>> > Hi all,
>>> >
>>> > It's close to the previously proposed 4.0.0 release date (June
>>> 2024), and I think it's time to prepare for it and discuss the ongoing
>>> projects:
>>> > •
>>> > ANSI by default
>>> > • Spark Connect GA
>>> > • Structured Logging
>>> > • Streaming state store data source
>>> > • new data type VARIANT
>>> > • STRING collation support
>>> > • Spark k8s operator versioning
>>> > Please help to add more items to this list that are missed here. I
>>> would like to volunteer as the release manager for Apache Spark 4.0.0 if
>>> there is no objection. Thank you all for the great work that fills Spark
>>> 4.0!
>>> >
>>> > Wenchen Fan
>>>
>>>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [LincolnTalk] Car mechanic

2024-04-29 Thread Sarah Cannon Holden
And another plug for Marconi’s!! Sarah Cannon Holden

On Apr 29, 2024, at 9:14 AM, Margo Fisher-Martin wrote:
I also recommend Marconi’s. Frankie from T is also still there some days.
All good!
Margo

On Mon, Apr 29, 2024 at 9:06 AM Rob Haslinger wrote:
Second plug for Marconi. They are fair, do solid work and Erica, their admin, is extremely responsive and helpful.

On Mon, Apr 29, 2024 at 8:54 AM Deb Wallace wrote:
Judy -- Marconi's on Rt. 117, formerly Joey's Auto. He was trained by Joey who sold him the business after he retired. Marconi is an excellent mechanic and always fair. 781-259-9794
Deb
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at https://pairlist9.pair.net/mailman/listinfo/lincoln.


-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at https://pairlist9.pair.net/mailman/listinfo/lincoln.


-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Holden Karau
+1

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Fri, Apr 26, 2024 at 12:06 PM L. C. Hsieh  wrote:

> +1
>
> On Fri, Apr 26, 2024 at 10:01 AM Dongjoon Hyun 
> wrote:
> >
> > I'll start with my +1.
> >
> > Dongjoon.
> >
> > On 2024/04/26 16:45:51 Dongjoon Hyun wrote:
> > > Please vote on SPARK-46122 to set
> spark.sql.legacy.createHiveTableByDefault
> > > to `false` by default. The technical scope is defined in the following
> PR.
> > >
> > > - DISCUSSION:
> > > https://lists.apache.org/thread/ylk96fg4lvn6klxhj6t6yh42lyqb8wmd
> > > - JIRA: https://issues.apache.org/jira/browse/SPARK-46122
> > > - PR: https://github.com/apache/spark/pull/46207
> > >
> > > The vote is open until April 30th 1AM (PST) and passes
> > > if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> > >
> > > [ ] +1 Set spark.sql.legacy.createHiveTableByDefault to false by
> default
> > > [ ] -1 Do not change spark.sql.legacy.createHiveTableByDefault because
> ...
> > >
> > > Thank you in advance.
> > >
> > > Dongjoon
> > >
> >
> > -
> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [FYI] SPARK-47993: Drop Python 3.8

2024-04-25 Thread Holden Karau
+1

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Thu, Apr 25, 2024 at 11:18 AM Maciej  wrote:

> +1
>
> Best regards,
> Maciej Szymkiewicz
>
> Web: https://zero323.net
> PGP: A30CEF0C31A501EC
>
> On 4/25/24 6:21 PM, Reynold Xin wrote:
>
> +1
>
> On Thu, Apr 25, 2024 at 9:01 AM Santosh Pingale
>  
> wrote:
>
>> +1
>>
>> On Thu, Apr 25, 2024, 5:41 PM Dongjoon Hyun 
>> wrote:
>>
>>> FYI, there is a proposal to drop Python 3.8 because its EOL is October
>>> 2024.
>>>
>>> https://github.com/apache/spark/pull/46228
>>> [SPARK-47993][PYTHON] Drop Python 3.8
>>>
>>> Since it's still alive and there will be an overlap between the
>>> lifecycle of Python 3.8 and Apache Spark 4.0.0, please give us your
>>> feedback on the PR, if you have any concerns.
>>>
>>> From my side, I agree with this decision.
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>


[Bug 2018504] Re: cups-browsed is using an excessive amount of CPU

2024-04-19 Thread Holden Karau
+1, also running into this.
If I restart cups the issue goes away for "a while" (interestingly, printing 
does not seem to impact cups, meaning the behavior is probably unrelated to 
printing).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2018504

Title:
  cups-browsed is using an excessive amount of CPU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cups-browsed/+bug/2018504/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-13 Thread Holden Karau
+1 -- even if it's not perfect, now is the time to change default values

On Sat, Apr 13, 2024 at 4:11 PM Hyukjin Kwon  wrote:

> +1
>
> On Sun, Apr 14, 2024 at 7:46 AM Chao Sun  wrote:
>
>> +1.
>>
>> This feature is very helpful for guarding against correctness issues,
>> such as null results due to invalid input or math overflows. It’s been
>> there for a while now and it’s a good time to enable it by default as Spark
>> enters the next major release.
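
For context, a minimal sketch (not from the thread) of the behavior difference 
ANSI mode makes, using the standard spark.sql.ansi.enabled config:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("ansi-demo").getOrCreate()

// Legacy (non-ANSI) mode: invalid input silently becomes NULL.
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT CAST('not a number' AS INT)").show() // -> NULL

// ANSI mode: the same cast fails at runtime instead of corrupting results.
spark.conf.set("spark.sql.ansi.enabled", "true")
// spark.sql("SELECT CAST('not a number' AS INT)").show() // -> throws
```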
>>
>> On Sat, Apr 13, 2024 at 3:27 PM Dongjoon Hyun 
>> wrote:
>>
>>> I'll start from my +1.
>>>
>>> Dongjoon.
>>>
>>> On 2024/04/13 22:22:05 Dongjoon Hyun wrote:
>>> > Please vote on SPARK-44444 to use ANSI SQL mode by default.
>>> > The technical scope is defined in the following PR which is
>>> > one line of code change and one line of migration guide.
>>> >
>>> > - DISCUSSION:
>>> > https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
>>> > - JIRA: https://issues.apache.org/jira/browse/SPARK-44444
>>> > - PR: https://github.com/apache/spark/pull/46013
>>> >
>>> > The vote is open until April 17th 1AM (PST) and passes
>>> > if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>> >
>>> > [ ] +1 Use ANSI SQL mode by default
>>> > [ ] -1 Do not use ANSI SQL mode by default because ...
>>> >
>>> > Thank you in advance.
>>> >
>>> > Dongjoon
>>> >
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Introducing Apache Gluten(incubating), a middle layer to offload Spark to native engine

2024-04-10 Thread Holden Karau
On Wed, Apr 10, 2024 at 9:54 PM Binwei Yang  wrote:

>
> Gluten currently already supports the Velox and ClickHouse backends.
> DataFusion support has also been proposed, but no one has worked on it yet.
>
> Gluten isn't a POC. It's under active development, but some companies
> already use it.
>
>
> On 2024/04/11 03:32:01 Dongjoon Hyun wrote:
> > I'm interested in your claim.
> >
> > Could you elaborate or provide some evidence for your claim, *a door for
> > all native libraries*, Binwei?
> >
> > For example, is there any POC for that claim? Maybe, did I miss something
> > in that SPIP?
>
I think the concern here is that there are multiple different layers to get 
from Spark -> native code, and ideally any changes we introduce in Spark would 
be for common functionality that is useful across them (e.g. DataFusion Comet, 
Gluten, Photon*, etc.)


* Photon being harder to guess at since it's closed source.

> >
> > Dongjoon.
> >
> > On Wed, Apr 10, 2024 at 8:19 PM Binwei Yang  wrote:
> >
> > >
> > > The SPIP is not for current Gluten, but open a door for all native
> > > libraries and accelerators support.
> > >
> > > On 2024/04/11 00:27:43 Weiting Chen wrote:
> > > > Yes, the 1st Apache release(v1.2.0) for Gluten will be in September.
> > > > For Spark version support, currently Gluten v1.1.1 support Spark3.2
> and
> > > 3.3.
> > > > We are planning to support Spark3.4 and 3.5 in Gluten v1.2.0.
> > > > Spark4.0 support for Gluten is depending on the release schedule in
> > > Spark community.
> > > >
> > > > On 2024/04/09 07:14:13 Dongjoon Hyun wrote:
> > > > > Thank you for sharing, Weiting.
> > > > >
> > > > > Do you think you can share the future milestone of Apache Gluten?
> > > > > I'm wondering when the first stable release will come and how we
> can
> > > > > coordinate across the ASF communities.
> > > > >
> > > > > > This project is still under active development now, and doesn't
> have
> > > a
> > > > > stable release.
> > > > > > https://github.com/apache/incubator-gluten/releases/tag/v1.1.1
> > > > >
> > > > > In the Apache Spark community, Apache Spark 3.2 and 3.3 is the end
> of
> > > > > support.
> > > > > And, 3.4 will have 3.4.3 next week and 3.4.4 (another EOL release)
> is
> > > > > scheduled in October.
> > > > >
> > > > > For the SPIP, I guess it's applicable for Apache Spark 4.0.0 only
> if
> > > there
> > > > > is something we need to do from Spark side.
> > > > >
> > > > > Thanks,
> > > > > Dongjoon.
> > > > >
> > > > >
> > > > > On Mon, Apr 8, 2024 at 11:19 PM WeitingChen <
> weitingc...@apache.org>
> > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > We are excited to introduce a new Apache incubating project
> called
> > > Gluten.
> > > > > > Gluten serves as a middleware layer designed to offload Spark to
> > > native
> > > > > > engines like Velox or ClickHouse.
> > > > > > For more detailed information, please visit the project
> repository at
> > > > > > https://github.com/apache/incubator-gluten
> > > > > >
> > > > > > Additionally, a new Spark SPIP related to Spark + Gluten
> > > collaboration has
> > > > > > been proposed at
> https://issues.apache.org/jira/browse/SPARK-47773.
> > > > > > We eagerly await feedback from the Spark community.
> > > > > >
> > > > > > Thanks,
> > > > > > Weiting.
> > > > > >
> > > > > >
> > > > >
> > > >
> > > > -
> > > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> > > >
> > > >
> > >
> > > -
> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> > >
> > >
> >
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: SPIP: Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on Various Native Engines

2024-04-09 Thread Holden Karau
I like the idea of improving flexibility of Sparks physical plans and
really anything that might reduce code duplication among the ~4 or so
different accelerators.

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Tue, Apr 9, 2024 at 3:14 AM Dongjoon Hyun 
wrote:

> Thank you for sharing, Jia.
>
> I have the same questions like the previous Weiting's thread.
>
> Do you think you can share the future milestone of Apache Gluten?
> I'm wondering when the first stable release will come and how we can
> coordinate across the ASF communities.
>
> > This project is still under active development now, and doesn't have a
> stable release.
> > https://github.com/apache/incubator-gluten/releases/tag/v1.1.1
>
> In the Apache Spark community, Apache Spark 3.2 and 3.3 is the end of
> support.
> And, 3.4 will have 3.4.3 next week and 3.4.4 (another EOL release) is
> scheduled in October.
>
> For the SPIP, I guess it's applicable for Apache Spark 4.0.0 only if there
> is something we need to do from Spark side.
>
+1 I think any changes need to target 4.0

>
> Thanks,
> Dongjoon.
>
>
> On Tue, Apr 9, 2024 at 12:22 AM Ke Jia  wrote:
>
>> Apache Spark currently lacks an official mechanism to support
>> cross-platform execution of physical plans. The Gluten project offers a
>> mechanism that utilizes the Substrait standard to convert and optimize
>> Spark's physical plans. By introducing Gluten's plan conversion,
>> validation, and fallback mechanisms into Spark, we can significantly
>> enhance the portability and interoperability of Spark's physical plans,
>> enabling them to operate across a broader spectrum of execution
>> environments without requiring users to migrate, while also improving
>> Spark's execution efficiency through the utilization of Gluten's advanced
>> optimization techniques. And the integration of Gluten into Spark has
>> already shown significant performance improvements with ClickHouse and
>> Velox backends and has been successfully deployed in production by several
>> customers.
>>
>> References:
>> JIRA Ticket 
>> SPIP Doc
>> 
>>
>> Your feedback and comments are welcome and appreciated.  Thanks.
>>
>> Thanks,
>> Jia Ke
>>
>


Re: Apache Spark 3.4.3 (?)

2024-04-06 Thread Holden Karau
Sounds good to me :)

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Sat, Apr 6, 2024 at 2:51 PM Dongjoon Hyun 
wrote:

> Hi, All.
>
> Apache Spark 3.4.2 tag was created on Nov 24th and `branch-3.4` has 85
> commits including important security and correctness patches like
> SPARK-45580, SPARK-46092, SPARK-46466, SPARK-46794, and SPARK-46862.
>
> https://github.com/apache/spark/releases/tag/v3.4.2
>
> $ git log --oneline v3.4.2..HEAD | wc -l
>   85
>
> SPARK-45580 Subquery changes the output schema of the outer query
> SPARK-46092 Overflow in Parquet row group filter creation causes incorrect
> results
> SPARK-46466 Vectorized parquet reader should never do rebase for timestamp
> ntz
> SPARK-46794 Incorrect results due to inferred predicate from checkpoint
> with subquery
> SPARK-46862 Incorrect count() of a dataframe loaded from CSV datasource
> SPARK-45445 Upgrade snappy to 1.1.10.5
> SPARK-47428 Upgrade Jetty to 9.4.54.v20240208
> SPARK-46239 Hide `Jetty` info
>
>
> Currently, I'm checking more applicable patches for branch-3.4. I'd like
> to propose to release Apache Spark 3.4.3 and volunteer as the release
> manager for Apache Spark 3.4.3. If there are no additional blockers, the
> first tentative RC1 vote date is April 15th (Monday).
>
> WDYT?
>
>
> Dongjoon.
>


Re: [VOTE] SPIP: Pure Python Package in PyPI (Spark Connect)

2024-04-01 Thread Holden Karau
+1

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Mon, Apr 1, 2024 at 5:44 PM Xinrong Meng  wrote:

> +1
>
> Thank you @Hyukjin Kwon 
>
> On Mon, Apr 1, 2024 at 10:19 AM Felix Cheung 
> wrote:
>
>> +1
>> --
>> *From:* Denny Lee 
>> *Sent:* Monday, April 1, 2024 10:06:14 AM
>> *To:* Hussein Awala 
>> *Cc:* Chao Sun ; Hyukjin Kwon ;
>> Mridul Muralidharan ; dev 
>> *Subject:* Re: [VOTE] SPIP: Pure Python Package in PyPI (Spark Connect)
>>
>> +1 (non-binding)
>>
>>
>> On Mon, Apr 1, 2024 at 9:24 AM Hussein Awala  wrote:
>>
>> +1 (non-binding). I'd add to the difference it will make that it will also
>> simplify package maintenance and make it easy to release a bug fix/new
>> feature without needing to wait for a PySpark release.
>>
>> On Mon, Apr 1, 2024 at 4:56 PM Chao Sun  wrote:
>>
>> +1
>>
>> On Sun, Mar 31, 2024 at 10:31 PM Hyukjin Kwon 
>> wrote:
>>
>> Oh I didn't send the discussion thread out as it's pretty simple,
>> non-invasive and the discussion was sort of done as part of the Spark
>> Connect initial discussion ..
>>
>> On Mon, Apr 1, 2024 at 1:59 PM Mridul Muralidharan 
>> wrote:
>>
>>
>> Can you point me to the SPIP’s discussion thread please ?
>> I was not able to find it, but I was on vacation, and so might have
>> missed this …
>>
>>
>> Regards,
>> Mridul
>>
>>
>> On Sun, Mar 31, 2024 at 9:08 PM Haejoon Lee
>>  wrote:
>>
>> +1
>>
>> On Mon, Apr 1, 2024 at 10:15 AM Hyukjin Kwon 
>> wrote:
>>
>> Hi all,
>>
>> I'd like to start the vote for SPIP: Pure Python Package in PyPI (Spark
>> Connect)
>>
>> JIRA 
>> Prototype 
>> SPIP doc
>> 
>>
>> Please vote on the SPIP for the next 72 hours:
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don’t think this is a good idea because …
>>
>> Thanks.
>>
>>


[jira] [Created] (SPARK-47672) Avoid double evaluation of non-trivial projected elements from filter pushdown

2024-04-01 Thread Holden Karau (Jira)
Holden Karau created SPARK-47672:


 Summary: Avoid double evaluation of non-trivial projected elements 
from filter pushdown
 Key: SPARK-47672
 URL: https://issues.apache.org/jira/browse/SPARK-47672
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.1
Reporter: Holden Karau


Repro here [https://gist.github.com/holdenk/0f9660bcbd9e63aaff904f15d3439db1] 
 
You can work around this by marking an expensive UDF as non-deterministic, but 
that's not ideal and won't help with expensive internal operations (like string 
matching).
 
Instead, when we go to bubble up a filter, we should not move it above a 
projection that computes the expression being filtered on.
 
https://issues.apache.org/jira/browse/SPARK-40045 partially fixed some of this 
by (roughly) ordering filter expressions by cost so that we're not evaluating 
more than ~2x (e.g. in the old behavior a bubbled-up filter could become the 
first element of the filter, the cheap null checks would then come after it, 
and we'd run the expensive compute on everything, not just the filtered data). 
But we should "trust" the user's projection + later use of that projection as 
an indication that a UDF is expensive, and only evaluate it once inside of the 
projection, filtering afterwards.
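
For illustration, a minimal sketch of the pattern (`expensiveUdf` is a 
stand-in; the gist linked above is the actual repro, and the exact plan 
depends on the Spark version):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("double-eval").getOrCreate()
import spark.implicits._

// Stand-in for any costly deterministic UDF.
val expensiveUdf = udf((s: String) => { Thread.sleep(10); s.length })

val df = Seq("a", "bb", "ccc").toDF("s")
  .select($"s", expensiveUdf($"s").as("len")) // projected once here...
  .filter($"len" > 1)                         // ...then filtered on the alias

// Because the deterministic predicate gets pushed below the projection, the
// optimized plan can evaluate expensiveUdf twice per row: once in the pushed
// Filter and once in the Project. Marking the UDF non-deterministic blocks
// the pushdown, which is the (imperfect) workaround mentioned above.
df.explain(true)
```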
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [RBW] Re: How do I know when a saddle fits?

2024-03-31 Thread Anthony Holden
+1 for the Rivet Pearl. I've got one on a Soma Double Cross, and the
version with the cutout is indeed an excellent fit for several positions
fore and aft!

On Sat, Mar 30, 2024, 3:25 PM ascpgh  wrote:

> Emily, I have dealt with roughly your sort of physiological dimensioning
> my whole riding life and currently have three variations of coping, each
> with saddles that bear varying results. Those results have inseparable
> connection to how well each of the bikes they are on fits me.
>
> My commuter is a stock Surly Disc Trucker "box bike". I added a Brooks
> B-17, pedals, Nitto RM 013 handlebar, dyno hub/lights and a shorter
> stem. The frame size that produced the best reach to the handlebars
> required a long extension of the seat post and would have required a
> pretty high angle stem for those bars to be level with the seat, my comfort
> zone, and would also need significant setback dimension of the seat post
> head. The frame size that gives the best pedaling position requires a short
> stem to keep me from reaching, sort of. I still feel like I reach for the
> bars on that bike and do not ride it more than 20 miles. Even on the bigger
> size I find my legs drive me to slide back on the seat, onto the cantle
> (that metal thing) for many climbs before I pedal out of the saddle. Lots
> of compromises but it's my 14 hour lock up bike.
>
> My Rivendell Rambouillet was the best stock bike fit to my body I'd ever
> experienced and prevented me from going custom. Grant envisioned it as a
> long hours in the saddle sporty/light touring bike in the French
> audax/randonnour-inspired design for comfort over hours of riding. Shorter
> top tube than seat tube, with 2° upslope and 2 cm extended top head tube
> lug all conspired to provide this. It all conspires to fit me well. It
> was a stock build kit from Riv with the B-17, RM 013 bars, I added the
> fenders, and changed the derailleurs, shifters and brakes. I do pedal from
> the saddle quite a bit more than others in groups before getting out of it
> and standing for hills. I still find myself sliding back on the seat for a
> rearward position to get some pushing forward on the pedal strokes when
> going uphill. That puts me on that cantle again, less than on the commuter
> but with the bars in more comfortable reach.
>
> I finally did go custom for the sort of riding I have available and
> enjoy from my front door. I've refined what I look for in saddles,
> acknowledging that I do stay on the saddle across more terrain than others,
> scooting rearward for that pushing bit where others pop up, pedaling out of
> their saddles. That fore and aft position range has made me a connoisseur
> of saddle tops that have a platform of surface wide enough for my sit bones
> but also retaining that in the longitudinal dimension of my back and forth
> positioning. I want that platform to be level and I don't want extra
> material rubbing my legs.
>
> I am using a Rivet Pearl with cut out on my custom bike. The cut out
> relieves the centerline of the leather "hammock" between the nose and
> cantle, which never gets the sit-bone weight that breaks in the spots that
> do bear it. Without the cut out, that leather
> remains a linear high ridge from front to back where the less skeletal
> portions of my rear end are perched. I thought I had picked perfectly when
> I chose the Rivet Diablo but after three months' riding and several
> centuries everything was breaking in nicely except for that ridge line down
> the center and it was creating discomfort. They were quick to respond to my
> issue and sent the cut out version of the Pearl which has been perfect
> ever since.
>
> Hope this is of some help to your situation.
>
> Andy Cheatham
> Pittsburgh
>
>
>
>
>
>
>
> On Thursday, March 28, 2024 at 11:46:29 PM UTC-4 Emily Guise wrote:
>
>> Hey all, thanks so much for your insights! I'm local to Portland OR, and
>> there is a bike fitter in town, Pedal PT, who also does physical therapy.
>> I've been wondering if I should get a fit with them, and it seems like I
>> should look into it more seriously.
>>
>> My travel/adventure/distance bike is a Bike Friday, and that's the one
>> I'd get fit. I do tend to like the flatter saddles, and usually ride with
>> the nose titled up. A challenge is that I have very long arms and legs but
>> a shorter torso. Anyone with a similar body type have any advice?
>>
>> I have tried women's specific saddles- I tried a Terry Liberator for a
>> while, but it was just SO hard, even though the cutout was fantastic. The
>> same with the Brookses, I always felt like I was sitting on the metal edge
>> or the leather was as unforgiving as wood and as uncomfortable. I'm trying
>> out Riv's new plastic saddle on my Platypus right now. It's sort of
>> comfortable but also feels maybe not quite wide enough. I'll have to give
>> it a few more weeks.
>>
>>
>> On Thursday, March 28, 2024 at 7:22:12 AM UTC-7 

Re: [RBW] I have questions

2024-03-26 Thread Anthony Holden
Oh, man. Those wheels are gonna look ACE!!

I'll chime in anecdotally that the difference between 42-48 isn't super 
noticeable as long as you get your PSI where it feels comfy for you. Have 
an amazing time on that 2-day ride.

As for front rack security, I don't use a strap, but I also check bolt 
tension fairly regularly. The straps are ugly, but are great for peace of 
mind if you don't always check your bolts before getting out on a ride.

On Wednesday, March 20, 2024 at 9:21:08 AM UTC-7 J wrote:

> You don't say which Gravel King model you are using, but I see in your 
> Philly post that you have Ultradynamico Cava tires on your bike. So maybe 
> you run the file tread GK? Anyhow, I rode through 2 sets of 700x42 Gravel 
> King SK on my old Sam Hillbourne before moving up to 700x50 which just 
> barely fit. I thought I'd notice a big difference but it turned out not to 
> be true, as long as I kept the air pressure up. I only have 650b bikes now, 
> and don't ride Gravel King SK after discovering the Rene Herse file tread 
> much smoother and faster "feeling". I've switched back and forth from 42 
> and 48mm RH file treads as well as 42 Gran Bois and have settled on 48mm RH 
> (Switchback Hill) which measures quite a bit over 48mm on my wheels. The 
> 42mm tires gave the perception that I was faster but the strava data did 
> not corroborate, and the 48mm have so much lovely float over gravel 
> compared to anything narrower or with tooth, I figured why bother? YMMV but 
> I think 48s won't be an issue. If my words sway you at all towards RH, just 
> keep in mind that they are not great in wet conditions with steep descents 
> combined with rim brakes. I learned this twice this fall, and kept RH 
> knobbies on until a few days ago. 
>
> mysterious J
>
> On Wednesday, March 20, 2024 at 11:42:19 AM UTC-4 Patrick Moore wrote:
>
>> The 60 mm Schwalbe Big Ones that used to be on my dirt road Matthews were 
>> among the very fastest-rolling tires I've used, including various "racing" 
>> tires and 2 extralight RH models. I'd say that the right 48 mm tire will 
>> roll plenty fast. 
>>
>> I've not used any Gravel Kings.
>>
>> Patrick "it's not my tires that make me slow" Moore
>>
>> On Tue, Mar 19, 2024 at 7:10 PM Bicycle Belle Ding Ding! <
>> jonasa...@gmail.com> wrote:
>>
>>> ... Can 48 mm tires do a 15-17 mph road ride pace? I have 42 on all my 
>>> other bikes. Would 48s be slow? The ride is a 2 day event, 100 miles total. 
>>> I’d like to keep the tires if I could, because they’re new and they are fat 
>>> enough to also double as gravel tires, should I decide to do a gravel ride 
>>> again. But I do more road rides than anything else, and if those 48s will 
>>> cripple me, I’ll go back to 42s. What’s the consensus?
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups "RBW 
Owners Bunch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to rbw-owners-bunch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/rbw-owners-bunch/826f877f-bad8-442d-9178-6e61ae6e5072n%40googlegroups.com.


[RBW] Re: How do I know when a saddle fits?

2024-03-26 Thread Anthony Holden
Pain is an obvious sign that something is wrong. And I think that's one 
reason why it's easy to tell when a saddle ISN'T a fit. But what feels 
right can be so subjective.

Obviously Riv and Grant are big proponents of Brooks. I've had several 
Brooks saddles, and not every one feels the same. I've had B17s that were 
stiff as a board despite many years and miles of riding, and then more 
recently the B68 that I got with my Appaloosa that felt broken in from the 
first ride. With saddles, like anything else on a bike, YMMV. Despite the 
variety of experiences I've had with Brooks saddles, I've felt they were 
all comfortable in their own way. The key for me has been finding the 
positioning that works for a particular saddle on a particular bike. Moving 
it little by little to find the sweet spot. If I feel myself sliding up the 
nose as I pedal, I consider tilting the nose upward a little. If my knees 
feel out of plumb with my cranks, I shift it forward or aft a hair to find 
a position that works better. It's kind of like dialing in tire pressure. 
Just keep fiddling with it until you find what works for you and the kind 
of riding you do.

Any kind of pain, especially lingering pain (like ongoing numbness 
post-ride or a persistent nerve twinge, for example) is an indication that 
the saddle or its positioning isn't working for you. Normal pain that's 
derived from effort, however, is to be expected with any saddle. Riding a 
bike is never a pain-free activity in that sense. Using your muscles, 
putting pressure on your wrists, feet, and sit bones is going to cause some 
discomfort. One thing that helped me a ton is realizing that no bike rider 
(pro, enthusiast, or regular fella like me) is ever riding for any 
significant distance without changing up their posture. Moving around on 
the bike is normal. Shifting weight, pedaling while standing, moving hand 
positions--all these things can affect how a saddle feels for me. 
Especially, like you say, when the distance is over 20 miles or more. Butt 
toughness also tends to build up for me over the season. I'm always a 
little sore after not riding for a while. If I've been riding a lot lately, 
I can do more miles without a sore tush.

Don't know if any of that helps at all, but hopefully it gives you 
something to think about. I'm curious about others' experiences.

On Wednesday, March 20, 2024 at 1:28:42 PM UTC-7 George Schick wrote:

> Oh boy. There are so many variables that go into good saddle fit and 
> comfort that I'm not sure where to begin.  One has to do with the type of 
> shorts (or other garbs) which you wear to ride.  Many on this blog have 
> talked about the thinner the padding in their shorts the more comfy the 
> ride.  Then again, there is the matter of riding position.  If you are 
> riding in a more upright position on a bike with bars that reach way back 
> you will likely put more pressure and possibly friction on your groin area 
> causing discomfort.  There are those who seem to like riding that way - 
> kinda like a rolling leg press machine, putting lots and lots of pressure 
> on the pedals with every stroke in a very high gear, maybe that's how they 
> get by with it - but that's not normative with everyone.  And, of course, 
> there's always the usually undiscussed issue of just how sensitive those 
> lower bones (ischial tuberosity tissues) and other skin and muscle 
> tissues are, and how they play a part. 
>
> On Wednesday, March 20, 2024 at 3:00:24 PM UTC-5 Emily Guise wrote:
>
>> Hello folks, I come to the group with a dilemma. I've never had a saddle 
>> that I could ride for longer than 20 miles comfortably. I've always ended 
>> up with sore sit bones, numb soft tissue, or both. This has really limited 
>> my ability to go on longer trips, and after my five-day ride on the C&O 
>> Canal trail last Sept, it was more apparent than ever I need to find a 
>> saddle that won't hurt. 
>>
>> I've tried dozens of saddles over the last 15 years- leather, plastic, 
>> cutouts, no cutouts, wide, medium, softer, harder, you name it. :( Most of 
>> the saddles that have stayed on my bikes for longer than a month have a 
>> central cut out, are on the wider side, and plastic. They're good for 
>> around town, but that's it. I've never had my sit bones measured. 
>>
>> It occurred to me recently that because I've never had a truly 
>> comfortable long-distance saddle, I have no idea how one feels. So I 
>> figured I'd ask the group. How did The One saddle feel for you? Did it 
>> "disappear"? Was it love at first sit? Did it need to be adjusted a lot 
>> before finding the ideal position? Is there a certain amount of miles you 
>> ride before it becomes uncomfortable? 
>>
>> I'd love to hear the group's collective wisdom so I know what to look for 
>> in the next saddle I try out. Thanks! 
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups "RBW 
Owners Bunch" group.
To unsubscribe from this group and stop 

[RBW] Re: anyone else tried Ritchey Beacon Bars?

2024-03-26 Thread Anthony Holden
Dave!! They look amazing on that Jones. I have an SWB that I've wanted to 
put drop bars on... these might be a candidate. Where do you mount your 
shifter?

On Friday, March 22, 2024 at 12:39:14 PM UTC-7 DavidP wrote:

> Great that the Roadini is working out so well for your son (and wife)!
>
> I have the wider XL version (52cm at the hoods, 67cm at the ends) on my 
> drop bar Jones 29er, which is setup with the drops as the primary position 
> (my bars are set just a bit lower than your son's). Despite the amount of 
> flare I find they are comfortable on the hoods also.
>
> My more roadish gravel bike has a 46cm Salsa Cowchipper.
>
> -Dave
>
> On Friday, March 22, 2024 at 3:08:16 PM UTC-4 pi...@gmail.com wrote:
>
>> I built up my son's Roadini with Ritchey Beacon Comp bars (
>> https://www.amazon.com/photos/shared/Hdny6ViFROaPcQIM_FkEbg.aW9haXdpnlfOy4Dg9_oNzx),
>>  
>> and I've had a few people test ride it. What impressed me about the bar was 
>> that despite purposefully not mentioning anything about the handlebars, 
>> everyone who's used the bike defaults to using the drops automatically. 
>> It's a great position, hybrid between regular drops and straight bars, and 
>> just to show how nice a bike the Roadini is, my wife used it on her commute 
>> a few days and now wants her own Roadini!
>>
>> Like all Grant Petersen bikes, it's the kind of bike where the more you 
>> ride it the more you like it. I've noticed that about his designs since the 
>> Bridgestone RB-1. I still feel that the bike could use a lower BB 
>> (especially when shod with 38mm tires), but riding with 28mm tires makes 
>> the bike feel so agile.
>>
>

-- 
You received this message because you are subscribed to the Google Groups "RBW 
Owners Bunch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to rbw-owners-bunch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/rbw-owners-bunch/9d7dc10f-5994-4201-86e4-eaa3c788cd48n%40googlegroups.com.


[RBW] Re: Roadini on F1 Track ride report

2024-03-19 Thread Anthony Holden
That sounds like a total blast! I think it speaks well of an event when you 
can see riders of all styles, abilities, and ages coming together to have a 
good time. Love that Roadini. The Safety Pizza and snazzy bar tape 
complement it so well.

On Wednesday, March 13, 2024 at 1:37:05 PM UTC-7 Mike Packard wrote:

> Howdy,
>
> We have a Formula 1 track in Austin called Circuit of the Americas and on 
> many Tuesdays they have a bike night. Last night was the first one of the 
> season. 
>
> The track is about a 3.36 mile loop with 1 very steep (11%) uphill (and 
> corresponding steep downhill). The pavement is so smooth and free of 
> debris, in certain places it makes a satisfying sticky-grippy sound as the 
> tires roll. There is a bypass for the big hill if one does not want to do 
> it every lap.  
>
> Aside from the novelty of riding on an F1 race track, the really special 
> thing about it is it's just nice to ride somewhere without having a single 
> thought about cars or having to stop for any reason. There's so much space. 
> Everyone can ride at the pace they want. There are some really fun slight 
> descent sections that are a blast to pedal hard and get going really fast 
> under my own power (i.e. not just hill induced). Or just toodle around with 
> your legs outstretched singing out loud.
>
> I brought my 57 Roadini and had a lovely time. This time was neat because 
> my friend brought his 8-year-old twins who'd never been before. I was 
> impressed they did the big downhill (I wouldn't have been brave enough at 
> that age.)
>
> Definitely worth checking out if you're within striking range of Austin, 
> especially before it gets too hot.
>
> Mike 
>
> https://circuitoftheamericas.com/bike-night/
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "RBW 
Owners Bunch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to rbw-owners-bunch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/rbw-owners-bunch/acc41f5f-a05f-4a67-8e07-8bf1e4ec91a8n%40googlegroups.com.


Re: [VOTE] SPIP: Structured Logging Framework for Apache Spark

2024-03-12 Thread Holden Karau
+1

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


On Mon, Mar 11, 2024 at 7:44 PM Reynold Xin 
wrote:

> +1
>
>
> On Mon, Mar 11 2024 at 7:38 PM, Jungtaek Lim 
> wrote:
>
>> +1 (non-binding), thanks Gengliang!
>>
>> On Mon, Mar 11, 2024 at 5:46 PM Gengliang Wang  wrote:
>>
>>> Hi all,
>>>
>>> I'd like to start the vote for SPIP: Structured Logging Framework for
>>> Apache Spark
>>>
>>> References:
>>>
>>>- JIRA ticket 
>>>- SPIP doc
>>>
>>> 
>>>- Discussion thread
>>>
>>>
>>> Please vote on the SPIP for the next 72 hours:
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don’t think this is a good idea because …
>>>
>>> Thanks!
>>> Gengliang Wang
>>>
>>


[jira] [Created] (SPARK-47220) log4j race condition during shutdown

2024-02-28 Thread Holden Karau (Jira)
Holden Karau created SPARK-47220:


 Summary: log4j race condition during shutdown
 Key: SPARK-47220
 URL: https://issues.apache.org/jira/browse/SPARK-47220
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: Holden Karau
Assignee: Holden Karau


There is a race condition during shutdown which can result in a few different 
errors:
 * ERROR Attempted to append to non-started appender
 * ERROR Unable to write to stream

Since I've only seen it during stop() triggered within a shutdown hook I 
believe this is caused by the parallel execution of shutdown hooks (see 
[https://stackoverflow.com/questions/17400136/how-to-log-within-shutdown-hooks-with-log4j2]
 )
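
For illustration, a minimal non-Spark sketch of the race (my own example, 
following the pattern discussed in the linked thread): a user shutdown hook 
that logs can run in parallel with log4j2's own shutdown hook, which stops the 
appenders.

{code:scala}
// Hedged sketch; assumes log4j2's default shutdown hook is enabled.
import org.apache.logging.log4j.LogManager

object ShutdownRace {
  private val log = LogManager.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    Runtime.getRuntime.addShutdownHook(new Thread(() => {
      // Runs concurrently with log4j2's shutdown hook; if the appenders are
      // stopped first, these calls can produce "Attempted to append to
      // non-started appender" or "Unable to write to stream".
      (1 to 100).foreach(i => log.info(s"stopping, step $i"))
    }))
    log.info("main done; shutdown hooks race from here")
  }
}
{code}

One common mitigation (an assumption about the fix, not a decision) is to 
disable log4j2's automatic shutdown hook and stop the logger context explicitly 
after the other shutdown hooks have finished.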



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[weewx-user] Re: wxt5x0 v0.7 driver issue

2024-02-27 Thread Stephen Holden
Thanks, just wanted to know if anyone else had issues so I could decide 
whether to spend some time looking at it now. Since you had no issues, I 
spent 20 minutes and checked everything and then re-installed the driver one 
more time after downloading and inspecting the driver code.  This time I 
got a prompt (y/n) during install that I'm 99% sure I didn't see before, 
not sure why.  
But after re-installing one more time it is working.

I hate those 'suddenly works, not sure why' issues, but don't have time to 
spend a day re-creating the exact instance (from weekly backup) to figure 
it out.

Thanks.

On Tuesday, February 27, 2024 at 4:55:53 p.m. UTC-5 vince wrote:

> Stephen - FWIW on v5 in a test I had no problems adding this driver or 
> seeing it available.  Did the commands in the github page and then "weectl 
> extension list" saw the driver/extension had been installed ok.  Tried the 
> reconfigure and it also appeared in the list (albeit in lower case near the 
> bottom of the list).
>
> On Tuesday, February 27, 2024 at 10:34:58 AM UTC-8 vince wrote:
>
>> No logs.   No transcript of what you did.  No info on your weewx version. 
>>  No info on whether you have a 'pip' or dpkg or setup installation.
>>
>> We can't read minds.  Need more info to even try to help.
>>
>> On Tuesday, February 27, 2024 at 9:53:23 AM UTC-8 Stephen Holden wrote:
>>
>>> I just updated my WXT5x0 driver to the latest version as per the README 
>>> on Matthew's github page, but now the wxt5x0 driver is not listed as an 
>>> option when I re-configure.
>>>
>>> Tried installing it again, but no luck. And the weewx.conf is unchanged 
>>> so points to the (older) wxt5x0 driver, which it can't find.
>>>
>>> Anyone else have issues updating the driver?
>>>
>>> Thanks in advance!
>>> S.
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to weewx-user+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/weewx-user/9dcb7ca6-9d41-4f51-8d40-4b769c0b3f67n%40googlegroups.com.


[weewx-user] wxt5x0 v0.7 driver issue

2024-02-27 Thread Stephen Holden
I just updated my WXT5x0 driver to the latest version as per the README on 
Matthew's github page, but now the wxt5x0 driver is not listed as an option 
when I re-configure.

Tried installing it again, but no luck. And the weewx.conf is unchanged so 
points to the (older) wxt5x0 driver, which it can't find.

Anyone else have issues updating the driver?

Thanks in advance!
S.

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to weewx-user+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/weewx-user/46250096-9656-465d-8c3d-cf7007c6ee4an%40googlegroups.com.


Re: Generating config docs automatically

2024-02-21 Thread Holden Karau
I think this is a good idea. I like having everything in one source of
truth rather than two (so option 1 sounds right to me), but that's
just my opinion. I'd be happy to help with reviews though.

On Wed, Feb 21, 2024 at 6:37 AM Nicholas Chammas 
wrote:

> I know config documentation is not the most exciting thing. If there is
> anything I can do to make this as easy as possible for a committer to
> shepherd, I’m all ears!
>
>
> On Feb 14, 2024, at 8:53 PM, Nicholas Chammas 
> wrote:
>
> I’m interested in automating our config documentation and need input from
> a committer who is interested in shepherding this work.
>
> We have around 60 tables of configs across our documentation. Here’s a
> typical example.
> 
>
> These tables span several thousand lines of manually maintained HTML,
> which poses a few problems:
>
>- The documentation for a given config is sometimes out of sync across
>the HTML table and its source `ConfigEntry`.
>- Internal configs that are not supposed to be documented publicly
>sometimes are.
>- Many config names and defaults are extremely long, posing formatting
>problems.
>
>
> Contributors waste time dealing with these issues in a losing battle to
> keep everything up-to-date and consistent.
>
> I’d like to solve all these problems by generating HTML tables
> automatically from the `ConfigEntry` instances where the configs are
> defined.
>
> I’ve proposed two alternative solutions:
>
>- #44755 : Enhance
>`ConfigEntry` so a config can be associated with one or more groups, and
>use that new metadata to generate the tables we need.
>- #44756 : Add a
>standalone YAML file where we define config groups, and use that to
>generate the tables we need.
>
>
> If you’re a committer and are interested in this problem, please chime in
> on whatever approach appeals to you. If you think this is a bad idea, I’m
> also eager to hear your feedback.
>
> Nick
>
>
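
For concreteness, a rough sketch of what the generation step could look like 
(hypothetical names, not the actual code in either PR): however the groups are 
defined, each documented ConfigEntry-like record gets rendered into the HTML 
table row the docs currently hand-maintain.

    // Hypothetical shape; Spark's real ConfigEntry has more fields.
    case class ConfigDoc(key: String, default: String, meaning: String, since: String)

    def toHtmlTable(configs: Seq[ConfigDoc]): String = {
      val header =
        "<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>"
      val rows = configs.map { c =>
        s"<tr><td><code>${c.key}</code></td><td>${c.default}</td>" +
          s"<td>${c.meaning}</td><td>${c.since}</td></tr>"
      }
      (Seq("<table class=\"table\">", header) ++ rows :+ "</table>").mkString("\n")
    }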


Re: Spark 4.0 Query Analyzer Bug Report

2024-02-20 Thread Holden Karau
Do you mean Spark 3.4? 4.0 is very much not released yet.

Also it would help if you could share your query & more of the logs leading
up to the error.

On Tue, Feb 20, 2024 at 3:07 PM Sharma, Anup 
wrote:

> Hi Spark team,
>
>
>
> We ran into a dataframe issue after upgrading from spark 3.1 to 4.
>
>
>
> query_result.explain(extended=True)\n  File
> \"…/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py\"
>
> raise Py4JJavaError(\npy4j.protocol.Py4JJavaError: An error occurred while 
> calling z:org.apache.spark.sql.api.python.PythonSQLUtils.explainString.\n: 
> java.lang.IllegalStateException: You hit a query analyzer bug. Please report 
> your query to Spark user mailing list.\n\tat 
> org.apache.spark.sql.execution.SparkStrategies$Aggregation$.apply(SparkStrategies.scala:516)\n\tat
>  
> org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)\n\tat
>  scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)\n\tat 
> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)\n\tat 
> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)\n\tat 
> org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)\n\tat
>  
> org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)\n\tat
>  
> org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)\n\tat
>  
> scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)\n\tat
>  
> scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)\n\tat
>  scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat 
> scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1431)\n\tat 
> scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)\n\tat 
> scala.collect...
>
>
>
>
>
> Could you please let us know if this is already being looked at?
>
>
>
> Thanks,
>
> Anup
>


-- 
Cell : 425-233-8271


[jira] [Resolved] (SPARK-47077) sbt build is broken due to selenium change

2024-02-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau resolved SPARK-47077.
--
Resolution: Cannot Reproduce

After blowing away my maven + ivy cache it works fine – should have done that 
earlier.

> sbt build is broken due to selenium change
> --
>
> Key: SPARK-47077
> URL: https://issues.apache.org/jira/browse/SPARK-47077
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 4.0.0, 3.5.2
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>  Labels: pull-request-available
>
> Building with sbt & JDK11 or 17 (executed after reload & clean 
> ;compile;catalyst/testOnly 
> org.apache.spark.sql.catalyst.optimizer.FilterPushdownSuite) results in
>  
> {code:java}
>  
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:20:8:
>  object WebDriver is not a member of package org.openqa.selenium
> [error] import org.openqa.selenium.WebDriver
> [error]        ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:33:27:
>  not found: type WebDriver
> [error]   override var webDriver: WebDriver = _
> [error]                           ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:37:29:
>  Class org.openqa.selenium.remote.AbstractDriverOptions not found - 
> continuing with a stub.
> [error]     val chromeOptions = new ChromeOptions
> [error]                             ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/RealBrowserUIHistoryServerSuite.scala:24:8:
>  object WebDriver is not a member of package org.openqa.selenium
> [error] import org.openqa.selenium.WebDriver
> [error]        ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/RealBrowserUIHistoryServerSuite.scala:43:27:
>  not found: type WebDriver
> [error]   implicit var webDriver: WebDriver
> [error]                           ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:39:21:
>  Class org.openqa.selenium.remote.RemoteWebDriver not found - continuing with 
> a stub.
> [error]     webDriver = new ChromeDriver(chromeOptions)
> [error]                     ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:20:28:
>  Unused import
> [error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=unused-imports, site=org.apache.spark.deploy.history
> [error] import org.openqa.selenium.WebDriver
> [error]                            ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:36:8:
>  object WebDriver is not a member of package org.openqa.selenium
> [error] import org.openqa.selenium.WebDriver
> [error]        ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:383:29:
>  not found: type WebDriver
> [error]     implicit val webDriver: WebDriver = new HtmlUnitDriver
> [error]                             ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:37:8:
>  Class org.openqa.selenium.WebDriver not found - continuing with a stub.
> [error] import org.openqa.selenium.htmlunit.HtmlUnitDriver
> [error]        ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:383:45:
>  Class org.openqa.selenium.Capabilities not found - continuing with a stub.
> [error]     implicit val webDriver: WebDriver = new HtmlUnitDriver
> [error]                                             ^
> [error] 
> /home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:470:9:
>  Symbol 'type org.openqa.selenium.WebDriver' is missing from the classpath.
> [error] This symbol is required by 'value 
> org.scalatestplus.selenium.WebBrowser.go.driver'.
> [error] Make sure that type WebDriver is in your classpath and check for 
> conflicting dependencies with `-Ylog-classpath`.
> [error] A full rebuild may help if 'WebBrowser.class' was compiled against an 
> incompatible version of org.openqa.selenium.
> [error] 

[jira] [Created] (SPARK-47077) sbt build is broken due to selenium change

2024-02-16 Thread Holden Karau (Jira)
Holden Karau created SPARK-47077:


 Summary: sbt build is broken due to selenium change
 Key: SPARK-47077
 URL: https://issues.apache.org/jira/browse/SPARK-47077
 Project: Spark
  Issue Type: Improvement
  Components: Build, Tests
Affects Versions: 4.0.0, 3.5.2
Reporter: Holden Karau
Assignee: Holden Karau


Building with sbt & JDK11 or 17 (executed after reload & clean 
;compile;catalyst/testOnly 
org.apache.spark.sql.catalyst.optimizer.FilterPushdownSuite) results in

 
{code:java}
 
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:20:8:
 object WebDriver is not a member of package org.openqa.selenium
[error] import org.openqa.selenium.WebDriver
[error]        ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:33:27:
 not found: type WebDriver
[error]   override var webDriver: WebDriver = _
[error]                           ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:37:29:
 Class org.openqa.selenium.remote.AbstractDriverOptions not found - continuing 
with a stub.
[error]     val chromeOptions = new ChromeOptions
[error]                             ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/RealBrowserUIHistoryServerSuite.scala:24:8:
 object WebDriver is not a member of package org.openqa.selenium
[error] import org.openqa.selenium.WebDriver
[error]        ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/RealBrowserUIHistoryServerSuite.scala:43:27:
 not found: type WebDriver
[error]   implicit var webDriver: WebDriver
[error]                           ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:39:21:
 Class org.openqa.selenium.remote.RemoteWebDriver not found - continuing with a 
stub.
[error]     webDriver = new ChromeDriver(chromeOptions)
[error]                     ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/ChromeUIHistoryServerSuite.scala:20:28:
 Unused import
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=unused-imports, site=org.apache.spark.deploy.history
[error] import org.openqa.selenium.WebDriver
[error]                            ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:36:8:
 object WebDriver is not a member of package org.openqa.selenium
[error] import org.openqa.selenium.WebDriver
[error]        ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:383:29:
 not found: type WebDriver
[error]     implicit val webDriver: WebDriver = new HtmlUnitDriver
[error]                             ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:37:8:
 Class org.openqa.selenium.WebDriver not found - continuing with a stub.
[error] import org.openqa.selenium.htmlunit.HtmlUnitDriver
[error]        ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:383:45:
 Class org.openqa.selenium.Capabilities not found - continuing with a stub.
[error]     implicit val webDriver: WebDriver = new HtmlUnitDriver
[error]                                             ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:470:9:
 Symbol 'type org.openqa.selenium.WebDriver' is missing from the classpath.
[error] This symbol is required by 'value 
org.scalatestplus.selenium.WebBrowser.go.driver'.
[error] Make sure that type WebDriver is in your classpath and check for 
conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'WebBrowser.class' was compiled against an 
incompatible version of org.openqa.selenium.
[error]         go to target.toExternalForm
[error]         ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:470:12:
 could not find implicit value for parameter driver: 
org.openqa.selenium.WebDriver
[error]         go to target.toExternalForm
[error]            ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:36:28:
 Unused import
[error] Applicable -Wconf / @nowarn filters for this fatal warning: msg=<part of the message>, cat=unused-imports, site=org.apache.spark.deploy.history
[error] import org.openqa.selenium.WebDriver
[error]                            ^
[error] 
/home/holden/repos/spark/core/src/test/scala/org/apache/spark/deploy/history/RealBrowserUIHistoryServerSuite.scala:24:28:

[jira] [Updated] (SPARK-47001) Pushdown Verification in Optimizer.scala should support changed data types

2024-02-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-47001:
-
Description: When pushing a filter down in a union, the data type may not 
match exactly if the filter was constructed using the child dataframe 
reference. This is because the union's output is updated with a StructType 
merge of its children, which can turn non-nullable into nullable. These are 
still the same column despite the different nullability, so the filter should 
be safe to push down. As it currently stands we get an exception.  (was: Right 
now it asserts exact equality but uses semanticEquality for candidacy; this can 
result in an unexpected exception in Optimizer.scala when pushing down 
semantically equal but different values.)
Summary: Pushdown Verification in Optimizer.scala should support 
changed data types  (was: Pushdown Verification in Optimizer.scala should use 
semantic equals)

> Pushdown Verification in Optimizer.scala should support changed data types
> --
>
> Key: SPARK-47001
> URL: https://issues.apache.org/jira/browse/SPARK-47001
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> When pushing a filter down in a union, the data type may not match exactly if 
> the filter was constructed using the child dataframe reference. This is 
> because the union's output is updated with a StructType merge of its 
> children, which can turn non-nullable into nullable. These are still the same 
> column despite the different nullability, so the filter should be safe to 
> push down. As it currently stands we get an exception.
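
A minimal sketch of the nullability widening (my own illustration, not the 
original failing query; assumes an active SparkSession `spark`):

{code:scala}
import org.apache.spark.sql.functions.when
import spark.implicits._

val a = Seq(1, 2).toDF("id")                       // id: nullable = false
val b = a.select(when($"id" > 1, $"id").as("id"))  // when() w/o otherwise => nullable = true
val u = a.union(b)

println(u.schema("id").nullable)  // true: the union's merged schema widened it

// A filter built against a's non-nullable `id` attribute is still the same
// column, so pushing it below the union should be safe despite the
// nullability difference.
u.filter($"id" > 1).explain(true)
{code}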



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Dynamically Support Spark Native Engine in Iceberg

2024-02-13 Thread Holden Karau
This is great work! Very excited to see this.

Cell : 425-233-8271


On Tue, Feb 13, 2024 at 4:38 PM huaxin gao  wrote:

> Hello Iceberg community,
>
> As you may already know, Project Comet
> , a plugin to
> accelerate Spark query execution via leveraging DataFusion and Arrow, has
> been open sourced under the Apache Arrow umbrella. To capitalize on the
> capabilities of Project Comet, I propose the implementation of a dynamic
> plugin mechanism. This mechanism will enable seamless integration with not
> only Project Comet but also other native execution engines.
>
> I have prepared a spec
> 
> and also submitted a corresponding PR
> . Your feedback on this
> proposal would be greatly appreciated.
>
> Thanks,
> Huaxin
>
>
>
>


Re: Introducing Comet, a plugin to accelerate Spark execution via DataFusion and Arrow

2024-02-13 Thread Holden Karau
This looks really cool :) Out of interest what are the differences in the
approach between this and Gluten?

On Tue, Feb 13, 2024 at 12:42 PM Chao Sun  wrote:

> Hi all,
>
> We are very happy to announce that Project Comet, a plugin to
> accelerate Spark query execution via leveraging DataFusion and Arrow,
> has now been open sourced under the Apache Arrow umbrella. Please
> check the project repo
> https://github.com/apache/arrow-datafusion-comet for more details if
> you are interested. We'd love to collaborate with people from the open
> source community who share similar goals.
>
> Thanks,
> Chao
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


[jira] [Created] (SPARK-47031) Union with non-deterministic expression should be non-deterministic

2024-02-12 Thread Holden Karau (Jira)
Holden Karau created SPARK-47031:


 Summary: Union with non-deterministic expression should be 
non-deterministic
 Key: SPARK-47031
 URL: https://issues.apache.org/jira/browse/SPARK-47031
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.0
Reporter: Holden Karau


We have special case handling for nullability already, where any expression 
which is unioned with a nullable field becomes nullable; we should do the 
same for determinism.

 

I found this while I was poking around with push downs.

 

I believe the code to be updated would be output in the union case class.
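
Sketch of the situation (my own minimal example, not from the report; assumes 
an active SparkSession `spark`):

{code:scala}
import org.apache.spark.sql.functions.{lit, rand}

val left  = spark.range(5).withColumn("r", rand())   // non-deterministic
val right = spark.range(5).withColumn("r", lit(0.5)) // deterministic
val u = left.union(right)

// Mirroring the nullability rule, `r` in the union's output should be
// treated as non-deterministic, so optimizer rules that check
// Expression.deterministic (e.g. pushdowns) stay conservative.
{code}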



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Question about OpenCL VLOOKUP implementation

2024-01-23 Thread Dhiraj Holden
I have a question about the OpenCL implementation of VLOOKUP as given in
sc/source/core/opencl/op_spreadsheet.cxx. Right now, both the unsorted and
sorted VLOOKUP do a linear search to find the right value. I am
wondering if it would be better to do a binary search for sorted vlookup. I
could take care of that now that I've wrapped my head around this
implementation.

Thanks,
Dhiraj
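
For reference, the idea in a compact sketch (generic code for illustration, 
not the OpenCL kernel syntax used in op_spreadsheet.cxx): a range-lookup 
VLOOKUP over sorted data wants the last entry <= the key, which a binary 
search finds in O(log n) rather than O(n).

    // Returns the index of the largest key <= lookup, or -1 if none exists.
    def sortedLookup(keys: Array[Double], lookup: Double): Int = {
      var lo = 0
      var hi = keys.length - 1
      var found = -1
      while (lo <= hi) {
        val mid = lo + (hi - lo) / 2
        if (keys(mid) <= lookup) { found = mid; lo = mid + 1 } // candidate; go right
        else hi = mid - 1
      }
      found
    }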


Re: [Spark-Core] Improving Reliability of spark when Executors OOM

2024-01-16 Thread Holden Karau
Oh, interesting solution: a co-worker was suggesting something similar using
resource profiles to increase memory -- but your approach avoids a lot of
complexity, and I like it (we could extend it out to support resource
profile growth too).

I think an SPIP sounds like a great next step.

On Tue, Jan 16, 2024 at 10:46 PM kalyan  wrote:

> Hello All,
>
> At Uber, we had recently, done some work on improving the reliability of
> spark applications in scenarios of fatter executors going out of memory and
> leading to application failure. Fatter executors are those that have more
> than 1 task running on it at a given time concurrently. This has
> significantly improved the reliability of many spark applications for us at
> Uber. We made a blog about this recently. Link:
> https://www.uber.com/en-US/blog/dynamic-executor-core-resizing-in-spark/
>
> At a high level, we have done the below changes:
>
>1. When a Task fails with the OOM of an executor, we update the core
>requirements of the task to max executor cores.
>2. When the task is picked for rescheduling, the new attempt of the
>task happens to be on an executor where no other task can run concurrently.
>All cores get allocated to this task itself.
>3. This way we ensure that the configured memory is completely at the
>disposal of a single task. Thus eliminating contention of memory.
>
> The best part of this solution is that it's reactive. It kicks in only
> when the executors fail with the OOM exception.
>
> We understand that the problem statement is very common and we expect our
> solution to be effective in many cases.
>
> There could be more cases that can be covered. Executor failing with OOM
> is like a hard signal. The framework(making the driver aware of
> what's happening with the executor) can be extended to handle scenarios of
> other forms of memory pressure like excessive spilling to disk, etc.
>
> While we had developed this on Spark 2.4.3 in-house, we would like to
> collaborate and contribute this work to the latest versions of Spark.
>
> What is the best way forward here? Will an SPIP proposal to detail the
> changes help?
>
> Regards,
> Kalyan.
> Uber India.
>
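
For readers skimming: a heavily simplified sketch of the reactive resizing 
described in steps 1-3 above (made-up types, not Spark's scheduler API or 
Uber's actual patch):

    sealed trait FailureReason
    case object ExecutorOutOfMemory extends FailureReason
    case object OtherFailure extends FailureReason

    final case class TaskSpec(id: Long, var coresRequired: Int)

    def onTaskFailure(task: TaskSpec, reason: FailureReason, executorCores: Int): Unit =
      reason match {
        case ExecutorOutOfMemory =>
          // The retry can only be scheduled where all `executorCores` are
          // free, so it runs alone and gets the whole memory budget.
          task.coresRequired = executorCores
        case OtherFailure =>
          () // a normal retry keeps its original core requirement
      }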


-- 
Cell : 425-233-8271


Re: [BVARC] Solar effect

2024-01-05 Thread David Holden via BVARC
Space weather is a complicated subject. Solar flares emit both photons and 
protons. The photons are part of the electromagnetic radiation, which reaches 
earth at the speed of light. The protons ejected with or by the flare are mass 
moving at very high speeds and can reach earth in as little as 30 minutes. 
Flares are often accompanied by CMEs which, as the name suggests, are mass 
ejections, and these typically reach earth in a day or two. Solar storms are 
the high speed protons. CMEs also disrupt radio. There is enough information 
on space weather online to make almost anyone's eyes glaze over if you want 
more.

David WJ9O

> On Jan 5, 2024, at 11:15 AM, Mike Knerr via BVARC  wrote:
> 
> I understood that a coronal mass ejection released protons, not photons.
> These also bring high electromagnetic fields with them. I understood the
> electromagnetic fields are causing the problems. Just a thought.
> 
> Mike Knerr KI5UBL 73
> 
> On Fri, Jan 5, 2024, 10:03 AM Stephen Flowers via BVARC  wrote:
> 
>> Richard,
>> Good morning and good question. The way I understand it, solar flares
>> emit a large amount of photons at various frequencies. If these photons
>> are sufficiently energetic, then they will pass through a large portion
>> of the ionosphere and impact what we refer to as the D layer. Note that
>> the D layer is a daytime ionospheric layer that, according to some models
>> in the literature, is made up of NO+, NO+(H2O)n, H+(H2O)n, CO3−, and O3−.
>> These species readily combine with free electrons that increase in
>> numbers due to the flares. This in turn results in "less refraction" of E
>> waves that we as amateurs need to bounce our signals off of to
>> communicate. In a nutshell, D layer constituents don't refract as much,
>> and this is interpreted as "D layer absorption". If you look at this URL
>> you can play back a movie of the latest D layer absorption measurements.
>> During a solar storm you'll see the bar graph in the right hand corner
>> increase across multiple frequencies. Note that in a solar flare event
>> the lower frequencies are preferentially impacted. Of course other layers
>> in the ionosphere are also affected by solar flares in ~8 minutes of
>> emission, and CMEs hit us with a delayed impact of ~1 day or so; however,
>> you may be on to something when you say that the lower frequencies suffer
>> a bigger impact. In that case, you may be right in that the higher
>> frequencies, 20m and higher, may be the way to go for ham radio ops
>> during solar storms.
>> Thank you for bringing up this topic!
>> 73,
>> Stephen (W2WF)
>> 
>>> On Jan 5, 2024, at 9:03 AM, David Holden via BVARC  wrote:
>>> 
>>> A strong solar storm can cause a complete blackout of HF communication
>>> including the higher frequency bands. I was in a QSO a year or so ago
>>> and it just dropped as a solar storm hit. The noise floor dropped to
>>> zero as not even noise could propagate through the highly energized
>>> atmosphere. Lesser solar storms can increase noise, particularly on the
>>> lower bands, so 80 might be unusable while 20 might just be noisy.
>>> David WJ9O
>>> 
>>>> On Jan 4, 2024, at 10:25 PM, Richard Bonica via BVARC  wrote:
>>>> 
>>>> To all,
>>>> Tell me if I am wrong on this. During these solar storms, it is my
>>>> understanding to use the higher frequency rather than lower? If so,
>>>> are the 20 and 40m bands a good choice?
>>>> Thank you in advance
>>>> Richard
>>>> KG5YCU
Brazos Valley Amateur Radio Club

BVARC mailing list
BVARC@bvarc.org
http://mail.bvarc.org/mailman/listinfo/bvarc_bvarc.org
Publicly available archives are available here: https://www.mail-archive.com/bvarc@bvarc.org/ 



Re: [BVARC] Solar effect

2024-01-05 Thread David Holden via BVARC
A strong solar storm can cause a complete blackout of HF communication 
including the higher frequency bands. I was in a QSO a year or so ago and it 
just dropped as a solar storm hit. The noise floor dropped to zero as not even 
noise could propagate through the highly energized atmosphere. 

Lesser solar storms can increase noise particularly on the lower bands so 80 
might be unusable while 20 might just be noisy. 

David WJ9O 


> On Jan 4, 2024, at 10:25 PM, Richard Bonica via BVARC  wrote:
> 
> 
> To all,
> Tell me if I am wrong on this. During these solar storms, it is my 
> understanding to use the higher frequency rather than lower? If so, is 20 and 
> 40m bands a good choice?
> Thank you in advance
> Richard
> KG5YCU 
> 
> Brazos Valley Amateur Radio Club
> 
> BVARC mailing list
> BVARC@bvarc.org
> http://mail.bvarc.org/mailman/listinfo/bvarc_bvarc.org
> Publicly available archives are available here: 
> https://www.mail-archive.com/bvarc@bvarc.org/



Brazos Valley Amateur Radio Club

BVARC mailing list
BVARC@bvarc.org
http://mail.bvarc.org/mailman/listinfo/bvarc_bvarc.org
Publicly available archives are available here: 
https://www.mail-archive.com/bvarc@bvarc.org/ 


[LincolnTalk] Stucco

2023-12-26 Thread sarah cannon holden
Good Morning,

Perhaps I do not understand the archives but I don't see how I can search
for a STUCCO person who can repair some damaged outside walls on my house.
I know there have been suggestions in the past.  So... if anyone has a
suggestion for a Stucco repair person and/or can tell me how to search the
archives for a specific item, I would be most grateful.

Thank you very much.  Sarah
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



(spark) branch master updated: [SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to ViewCatalog

2023-12-05 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 89673da0beb2 [SPARK-45807][SQL] Add createOrReplaceView(..) / 
replaceView(..) to ViewCatalog
89673da0beb2 is described below

commit 89673da0beb2b64434d29a94e07fa9c6fb4a93e8
Author: Eduard Tudenhoefner 
AuthorDate: Tue Dec 5 10:29:32 2023 -0800

[SPARK-45807][SQL] Add createOrReplaceView(..) / replaceView(..) to 
ViewCatalog

### What changes were proposed in this pull request?

ViewCatalog API improvements described in 
[SPIP](https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing)
 that didn't make it into the codebase as part of #37556

### Why are the changes needed?

Required for DataSourceV2 view support.

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

N/A

Closes #43677 from nastra/SPARK-45807.

Authored-by: Eduard Tudenhoefner 
Signed-off-by: Holden Karau 
---
 .../spark/sql/connector/catalog/ViewCatalog.java   | 86 ++
 1 file changed, 86 insertions(+)

diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java
index eb67b9904869..eef348928b2c 100644
--- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java
@@ -140,6 +140,92 @@ public interface ViewCatalog extends CatalogPlugin {
   String[] columnComments,
   Map<String, String> properties) throws ViewAlreadyExistsException, 
NoSuchNamespaceException;
 
+  /**
+   * Replace a view in the catalog.
+   * <p>
+   * The default implementation has a race condition.
+   * Catalogs are encouraged to implement this operation atomically.
+   *
+   * @param ident a view identifier
+   * @param sql the SQL text that defines the view
+   * @param currentCatalog the current catalog
+   * @param currentNamespace the current namespace
+   * @param schema the view query output schema
+   * @param queryColumnNames the query column names
+   * @param columnAliases the column aliases
+   * @param columnComments the column comments
+   * @param properties the view properties
+   * @throws NoSuchViewException If the view doesn't exist or is a table
+   * @throws NoSuchNamespaceException If the identifier namespace does not 
exist (optional)
+   */
+  default void replaceView(
+  Identifier ident,
+  String sql,
+  String currentCatalog,
+  String[] currentNamespace,
+  StructType schema,
+  String[] queryColumnNames,
+  String[] columnAliases,
+  String[] columnComments,
+  Map<String, String> properties) throws NoSuchViewException, 
NoSuchNamespaceException {
+if (viewExists(ident)) {
+  dropView(ident);
+  try {
+createView(ident, sql, currentCatalog, currentNamespace, schema,
+queryColumnNames, columnAliases, columnComments, properties);
+  }
+  catch (ViewAlreadyExistsException e) {
+throw new RuntimeException("Race condition when dropping and creating 
view", e);
+  }
+} else {
+  throw new NoSuchViewException(ident);
+}
+  }
+
+  /**
+   * Create or replace a view in the catalog.
+   * 
+   * <p>
+   * Catalogs are encouraged to implement this operation atomically.
+   *
+   * @param ident a view identifier
+   * @param sql the SQL text that defines the view
+   * @param currentCatalog the current catalog
+   * @param currentNamespace the current namespace
+   * @param schema the view query output schema
+   * @param queryColumnNames the query column names
+   * @param columnAliases the column aliases
+   * @param columnComments the column comments
+   * @param properties the view properties
+   * @throws NoSuchNamespaceException If the identifier namespace does not 
exist (optional)
+   */
+  default void createOrReplaceView(
+  Identifier ident,
+  String sql,
+  String currentCatalog,
+  String[] currentNamespace,
+  StructType schema,
+  String[] queryColumnNames,
+  String[] columnAliases,
+  String[] columnComments,
+  Map<String, String> properties) throws NoSuchNamespaceException {
+if (viewExists(ident)) {
+  try {
+replaceView(ident, sql, currentCatalog, currentNamespace, schema,
+queryColumnNames, columnAliases, columnComments, properties);
+  } catch (NoSuchViewException e) {
+throw new RuntimeException("Race conditio

Re: Spark-Connect: Param `--packages` does not take effect for executors.

2023-12-04 Thread Holden Karau
So I think this sounds like a bug to me, in the help options for both
regular spark-submit and ./sbin/start-connect-server.sh we say:
"  --packages  Comma-separated list of maven coordinates of
jars to include
  on the driver and executor classpaths. Will
search the local
  maven repo, then maven central and any
additional remote
  repositories given by --repositories. The
format for the
  coordinates should be
groupId:artifactId:version."

If the behaviour is intentional for spark-connect it would be good to
understand why (and then also update the docs).
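
For anyone hitting this in the meantime, a minimal sketch of the 
shared-volume workaround suggested below (paths are hypothetical):

    import org.apache.spark.sql.SparkSession

    // `local:` URIs mean the JAR is expected at this path on every node,
    // e.g. a volume mounted into both driver and executor pods.
    val spark = SparkSession.builder()
      .config("spark.jars", "local:///mnt/shared-jars/iceberg-spark-runtime.jar")
      .getOrCreate()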

On Mon, Dec 4, 2023 at 3:33 PM Aironman DirtDiver 
wrote:

> The issue you're encountering with the iceberg-spark-runtime dependency
> not being properly passed to the executors in your Spark Connect server
> deployment could be due to a couple of factors:
>
>1.
>
>*Spark Submit Packaging:* When you use the --packages parameter in
>spark-submit, it only adds the JARs to the driver classpath. The
>executors still need to download and load the JARs separately. This
>can lead to issues if the JARs are not accessible from the executors, such
>as when running in a distributed environment like Kubernetes.
>2.
>
>*Kubernetes Container Image:* The Spark Connect server container image
>(xxx/spark-py:3.5-prd) might not have the iceberg-spark-runtime dependency
>pre-installed. This means that even if the JARs are available on the
>driver, the executors won't have access to them.
>
> To address this issue, consider the following solutions:
>
>1.
>
>*Package Dependencies into Image:* As you mentioned, packaging the
>required dependencies into your container image is a viable option. This
>ensures that the executors have direct access to the JARs, eliminating
>the need for downloading or copying during job execution.
>2.
>
>*Use Spark Submit with --jars Option:* Instead of relying on --packages
>, you can explicitly specify the JARs using the --jars option in
>spark-submit. This will package the JARs into the Spark application's
>submission directory, ensuring that they are available to both the
>driver and executors.
>3.
>
>*Mount JARs as Shared Volume:* If the iceberg-spark-runtime dependency
>is already installed on the cluster nodes, you can mount the JARs as a
>shared volume accessible to both the driver and executors. This avoids
>the need to package or download the JARs.
>Mounting JARs as a shared volume in your Spark Connect server
>deployment involves creating a shared volume that stores the JARs and then
>mounting that volume to both the driver and executor containers. Here's a
>step-by-step guide:
>
>Create a Shared Volume: Create a shared volume using a persistent
>storage solution like NFS, GlusterFS, or AWS EFS. Ensure that all cluster
>nodes have access to the shared volume.
>
>Copy JARs to Shared Volume: Copy the required JARs, including
>iceberg-spark-runtime, to the shared volume. This will make them accessible
>to both the driver and executor containers.
>
>Mount Shared Volume to Driver Container: In your Spark Connect server
>deployment configuration, specify the shared volume as a mount point for
>the driver container. This will make the JARs available to the driver.
>
>Mount Shared Volume to Executor Containers: In the Spark Connect
>server deployment configuration, specify the shared volume as a mount point
>for the executor containers. This will make the JARs available to the
>executors.
>
>Update Spark Connect Server Configuration: In your Spark Connect
>server configuration, ensure that an Iceberg catalog is registered via the
>spark.sql.catalog.* properties (for example,
>spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog). This
>will instruct Spark to use the Iceberg catalog implementation.
>
>By following these steps, you can successfully mount JARs as a shared
>volume in your Spark Connect server deployment, eliminating the need to
>package or download the JARs.
>4.
>
>*Use Spark Connect Server with Remote Resources:* Spark Connect Server
>supports accessing remote resources, such as JARs stored in a distributed
>file system or a cloud storage service. By configuring Spark Connect
>Server to use remote resources, you can avoid packaging the
>dependencies into the container image.
>
> By implementing one of these solutions, you should be able to resolve the
> issue of the iceberg-spark-runtime dependency not being properly passed to
> the executors in your Spark Connect server deployment.
>
> Let me know if any of the proposal works for you.
>
> Alonso
>
> El lun, 4 dic 2023 a las 11:44, Xiaolong Wang
> () escribió:
>
>> Hi, Spark community,
>>
>> I encountered a weird bug when using Spark Connect server to integrate
>> with Iceberg. I added the 

Re: [LincolnTalk] Possible to vote early today?

2023-12-02 Thread Sarah Cannon Holden
When you check in this a.m. you will be given two ballots. One is for the
Community Center ~ Article 2. The other is for the Housing Choice Act ~ Article
4. Voting on each article will take place after the presentation and discussion
of that article. There is no early voting.

Sarah Cannon Holden
Town Moderator

> On Dec 2, 2023, at 7:06 AM, Pat Gray  wrote:
> 
> Can someone confirm if a resident can register this morning and at the same 
> time get a ballot and vote? I hope so, because it would allow many people to 
> have their voices heard.
> Pat Gray 
> 
> Sent from my iPhone
> -- 
> The LincolnTalk mailing list.
> To post, send mail to Lincoln@lincolntalk.org.
> Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
> Change your subscription settings at 
> https://pairlist9.pair.net/mailman/listinfo/lincoln.
> 
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[LincolnTalk] Fwd: Your request for Reasonable Accommodation at Special Town Meeting Re: Voting on Saturday

2023-12-01 Thread Sarah Cannon Holden
Please see the note below from Tim Higgins. Our Town Counsel had advised that remote voting is precluded under State Law. Thank you. See you tomorrow.

Sarah Cannon Holden

Begin forwarded message:

From: "Higgins, Timothy S."
Date: December 1, 2023 at 11:41:53 AM EST
To: Sarah Liepert , Jane Marie , bethany.h.br...@mass.gov
Cc: "Pereira, Dan" , "Hutchinson, Jim" , "Cannon Holden, Sarah" , "Pereira, Dan"
Subject: RE: Your request for Reasonable Accommodation at Special Town Meeting Re: [LincolnTalk] Voting on Saturday

Hello everyone –
 
We have been in discussion with the Attorney General’s Office and have shared Town Counsel’s opinion.  We expect to have ongoing dialogue with them, and with the Mass Office of Disability next week.  In the meantime, we have been reaching
 out to anyone who has expressed a particular challenge to explore possible accommodations, over and above those we currently offer, short of providing remote voting which Town Counsel has advised is precluded under state law.
 
Tim Higgins
 
 
Timothy S. Higgins
Town Administrator
Town of Lincoln
16 Lincoln Road
Lincoln, MA 01773
 
higgi...@lincolntown.org
781 259 -2601
 
 
 


From: Sarah Liepert  
Sent: Friday, December 1, 2023 10:58 AM
To: Jane Marie ; bethany.h.br...@mass.gov
Cc: Pereira, Dan ; Hutchinson, Jim ; Cannon Holden, Sarah ; Higgins, Timothy S. 
Subject: Re: Your request for Reasonable Accommodation at Special Town Meeting Re: [LincolnTalk] Voting on Saturday

Dear Jane Marie,

Please get in touch with Bethany Brown, ADA Coordinator at the Massachusetts Attorney General’s Office as soon as possible. Her cell is 781-429-9286. She is copied on this email.

Importantly, the Massachusetts AGO (Attorney General’s Office) is not in agreement with the Lincoln Town Counsel’s advisory opinion.

You may document your concerns regarding ADA Reasonable Accommodations with Bethany Brown.

All the best,

Sarah Liepert
Trapelo Rd., Lincoln

On Dec 1, 2023, at 10:37 AM, Higgins, Timothy S. <higgi...@lincolntown.org> wrote:

Hello Sarah and Jane Marie,

The Town’s preparations for Town Meeting include a whole host of accommodation measures. Remote voting is not one of them, as Town Counsel has advised us that remote voting is precluded by State Law. Below please find her formal opinion. We are, however, live streaming the meeting so that folks at home will be able to observe and listen. The instructions for tuning in are included on the Town’s website.

Thank you for your question.

Tim Higgins

Timothy S. Higgins
Town Administrator
Town of Lincoln
16 Lincoln Road
Lincoln, MA 01773

higgi...@lincolntown.org
781 259-2601

Tim, this is to follow up on my initial response to you concerning reasonable accommodations at town meetings. We understand that the Massachusetts Office on Disability has taken the position that remote participation in a town meeting is a reasonable accommodation for a person who cannot attend. In our opinion, allowing remote participation in an open town meeting is specifically prohibited by law and would fundamentally change both the nature and the conduct of a town meeting.

As you are likely already aware, the Americans with Disabilities Act (“ADA”) requires public entities to make “reasonable modifications” to their procedures to accommodate persons with disabilities. What is “reasonable” is fact-specific and depends upon the nature of the program and the accommodation being sought. However, any change that would result in a “fundamental alteration” to the program or service is not required. A fundamental alteration is one that results in a change in the essential nature of the service or program. Likewise, a requested accommodation is not required if it would result in undue financial and administrative burdens.

The very purpose of a town meeting is for members of the community to gather together to debate and vote on legislative issues of the Town. Allowing some individuals to participate from a remote location fundamentally changes the public, legislative process. Moreover, such action is specifically not allowed by law. During the COVID-era revisions to various municipal laws, representative town meetings were specifically authorized to meet remotely. That authority was extended several times. During that same period, the General Court considered whether remote participation should be allowed at open town meetings; such a concept never received significant support, however, and was not enacted. Moreover, allowing one or a small group of people to participate remotely would result in undue financial and administrative burdens to the Town. Not only would such a system be difficult and costly to implement, it would be highly disruptive during the course of the meeting and would require significant adjustments in the procedures that are usually followed. Again, this p

Help need to recover my account

2023-11-27 Thread Leslie Holden
Now, my OpenOffice account is gone.  How can I restore it?  (I got my
> computer back from repair and I lost a lot of my programs ... like Open
> Office.) HELP

Re: Classpath isolation per SparkSession without Spark Connect

2023-11-27 Thread Holden Karau
So I don’t think we make any particular guarantees around class path
isolation there, so even if it does work it’s something you’d need to pay
attention to on upgrades. Class path isolation is tricky to get right.

On Mon, Nov 27, 2023 at 2:58 PM Faiz Halde  wrote:

> Hello,
>
> We are using spark 3.5.0 and were wondering if the following is achievable
> using spark-core
>
> Our use case involves spinning up a spark cluster where the driver
> application loads user jars containing spark transformations at runtime. A
> single spark application can load multiple user jars ( same cluster ) that
> can have class path conflicts if care is not taken
>
> AFAIK, to get this right requires the Executor to be designed in a way
> that allows for class path isolation ( UDF, lambda expressions ). Ideally
> per Spark Session is what we want
>
> I know Spark connect has been designed this way but Spark connect is not
> an option for us at the moment. I had some luck using a private method
> inside spark called JobArtifactSet.withActiveJobArtifactState
>
> Is it sufficient for me to run the user code enclosed
> within JobArtifactSet.withActiveJobArtifactState to achieve my requirement?
>
> Thank you
>
>
> Faiz
>


Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-14 Thread Holden Karau
+1

On Tue, Nov 14, 2023 at 10:21 AM DB Tsai  wrote:

> +1
>
> DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1
>
> On Nov 14, 2023, at 10:14 AM, Vakaris Baškirov <
> vakaris.bashki...@gmail.com> wrote:
>
> +1 (non-binding)
>
>
> On Tue, Nov 14, 2023 at 8:03 PM Chao Sun  wrote:
>
>> +1
>>
>> On Tue, Nov 14, 2023 at 9:52 AM L. C. Hsieh  wrote:
>> >
>> > +1
>> >
>> > On Tue, Nov 14, 2023 at 9:46 AM Ye Zhou  wrote:
>> > >
>> > > +1(Non-binding)
>> > >
>> > > On Tue, Nov 14, 2023 at 9:42 AM L. C. Hsieh  wrote:
>> > >>
>> > >> Hi all,
>> > >>
>> > >> I’d like to start a vote for SPIP: An Official Kubernetes Operator
>> for
>> > >> Apache Spark.
>> > >>
>> > >> The proposal is to develop an official Java-based Kubernetes operator
>> > >> for Apache Spark to automate the deployment and simplify the
>> lifecycle
>> > >> management and orchestration of Spark applications and Spark clusters
>> > >> on k8s at prod scale.
>> > >>
>> > >> This aims to reduce the learning curve and operation overhead for
>> > >> Spark users so they can concentrate on core Spark logic.
>> > >>
>> > >> Please also refer to:
>> > >>
>> > >>- Discussion thread:
>> > >> https://lists.apache.org/thread/wdy7jfhf7m8jy74p6s0npjfd15ym5rxz
>> > >>- JIRA ticket: https://issues.apache.org/jira/browse/SPARK-45923
>> > >>- SPIP doc:
>> https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE
>> > >>
>> > >>
>> > >> Please vote on the SPIP for the next 72 hours:
>> > >>
>> > >> [ ] +1: Accept the proposal as an official SPIP
>> > >> [ ] +0
>> > >> [ ] -1: I don’t think this is a good idea because …
>> > >>
>> > >>
>> > >> Thank you!
>> > >>
>> > >> Liang-Chi Hsieh
>> > >>
>> > >> -
>> > >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> > >>
>> > >
>> > >
>> > > --
>> > >
>> > > Zhou, Ye  周晔
>> >
>> > -
>> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> >
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>


[Python-Dev] Re: Retiring this mailing list ?

2023-11-14 Thread Steve Holden
On Mon, 13 Nov 2023 at 10:18, Marc-Andre Lemburg  wrote:

> [...]
>
> Question: Should we retire and archive this mailing list ?
> (I'm asking as one of the maintainers of the ML)
>
> [...]

Hi Marc-Andre,

Maybe just require senders to be members of the python.org domain, and
retain the release announcements?

Kind regards,
Steve

PS: Your mail triggered a visit to https://www.python.org/community/lists/
- it seems it could use some updates. For example,
comp.lang.python-announce is a news URL, which in this day and age will
baffle most visitors! At the very least the page could point to the
Discourse list.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SXLIZEV2Y6NHYYFWAMWL43JIIHR2AODD/
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-12 Thread Holden Karau
To be clear: I am generally supportive of the idea (+1) but have some
follow-up questions:

Have we taken the time to learn from the other operators? Do we have a
compatible CRD/API or not (and if so why?)
The API seems to assume that everything is packaged in the container in
advance, but I imagine that might not be the case for many folks who have
Java or Python packages published to cloud storage and they want to use?
What's our plan for the testing on the potential version explosion (not
tying ourselves to operator version -> spark version makes a lot of sense,
but how do we reasonably assure ourselves that the cross product of
Operator Version, Kube Version, and Spark Version all function)? Do we have
CI resources for this?
Is there a current (non-open source operator) that folks from Apple are
using and planning to open source, or is this a fresh "from the ground up"
operator proposal?
One of the key reasons for this is listed as "An out-of-the-box automation
solution that scales effectively" but I don't see any discussion of the
target scale or plans to achieve it?



On Thu, Nov 9, 2023 at 9:02 PM Zhou Jiang  wrote:

> Hi Spark community,
>
> I'm reaching out to initiate a conversation about the possibility of
> developing a Java-based Kubernetes operator for Apache Spark. Following the
> operator pattern (
> https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), Spark
> users may manage applications and related components seamlessly using
> native tools like kubectl. The primary goal is to simplify the Spark user
> experience on Kubernetes, minimizing the learning curve and operational
> complexities, and thereby enable users to focus on Spark application
> development.
>
> Although there are several open-source Spark on Kubernetes operators
> available, none of them are officially integrated into the Apache Spark
> project. As a result, these operators may lack active support and
> development for new features. Within this proposal, our aim is to introduce
> a Java-based Spark operator as an integral component of the Apache Spark
> project. This solution has been employed internally at Apple for multiple
> years, operating millions of executors in real production environments. The
> use of Java in this solution is intended to accommodate a wider user and
> contributor audience, especially those who are not familiar with Scala.
>
> Ideally, this operator should have its dedicated repository, similar to
> Spark Connect Golang or Spark Docker, allowing it to maintain a loose
> connection with the Spark release cycle. This model is also followed by the
> Apache Flink Kubernetes operator.
>
> We believe that this project holds the potential to evolve into a thriving
> community project over the long run. A comparison can be drawn with the
> Flink Kubernetes Operator: Apple has open-sourced internal Flink Kubernetes
> operator, making it a part of the Apache Flink project (
> https://github.com/apache/flink-kubernetes-operator). This move has
> gained wide industry adoption and contributions from the community. In a
> mere year, the Flink operator has garnered more than 600 stars and has
> attracted contributions from over 80 contributors. This showcases the level
> of community interest and collaborative momentum that can be achieved in
> similar scenarios.
>
> More details can be found at SPIP doc : Spark Kubernetes Operator
> https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE
>
> Thanks,
> --
> *Zhou JIANG*
>
>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [LincolnTalk] Transfer Station

2023-11-11 Thread sarah cannon holden
Thank you everyone!  It's open.

On Sat, Nov 11, 2023 at 10:05 AM sarah cannon holden <
sarahcannonhol...@gmail.com> wrote:

> Is it open today?
>
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[LincolnTalk] Transfer Station

2023-11-11 Thread sarah cannon holden
Is it open today?
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread Holden Karau
+1

On Mon, Nov 6, 2023 at 4:30 PM yangjie01 
wrote:

> +1
>
>
>
> *From:* Yuming Wang 
> *Date:* Tuesday, November 7, 2023, 07:00
> *To:* Santosh Pingale 
> *Cc:* Dongjoon Hyun , dev  >
> *Subject:* Re: Apache Spark 3.4.2 (?)
>
>
>
> +1
>
>
>
> On Tue, Nov 7, 2023 at 3:55 AM Santosh Pingale
>  wrote:
>
> Makes sense given the nature of those commits.
>
>
>
> On Mon, Nov 6, 2023, 7:52 PM Dongjoon Hyun 
> wrote:
>
> Hi, All.
>
> Apache Spark 3.4.1 tag was created on Jun 19th and `branch-3.4` has 103
> commits including important security and correctness patches like
> SPARK-44251, SPARK-44805, and SPARK-44940.
>
> https://github.com/apache/spark/releases/tag/v3.4.1
> 
>
> $ git log --oneline v3.4.1..HEAD | wc -l
> 103
>
> SPARK-44251 Potential for incorrect results or NPE when full outer
> USING join has null key value
> SPARK-44805 Data lost after union using
> spark.sql.parquet.enableNestedColumnVectorizedReader=true
> SPARK-44940 Improve performance of JSON parsing when
> "spark.sql.json.enablePartialResults" is enabled
>
> Currently, I'm checking the following open correctness issues. I'd like to
> propose to release Apache Spark 3.4.2 after resolving them and volunteer as
> the release manager for Apache Spark 3.4.2. If there are no additional
> blockers, the first tentative RC1 vote date is November 13rd (Monday). If
> it takes some time to resolve the open correctness issues, we can start the
> vote after Thanksgiving holiday.
>
> SPARK-44512 dataset.sort.select.write.partitionBy sorts wrong column
> SPARK-45282 Join loses records for cached datasets
>
> WDYT?
>
> Dongjoon.
>
>


[LincolnTalk] Photo Printers

2023-11-04 Thread sarah cannon holden
Does anyone have experience with either an Epson700 Sure Color or a Canon
Pro 300 printer?  Or if you want to suggest other printers, please do.  I
want to print 9x12 photo prints as well as note cards - that's my starting
point.

Thank you, Sarah
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[jira] [Created] (SPARK-45712) Provide a command line flag to override the log4j properties file

2023-10-27 Thread Holden Karau (Jira)
Holden Karau created SPARK-45712:


 Summary: Provide a command line flag to override the log4j 
properties file
 Key: SPARK-45712
 URL: https://issues.apache.org/jira/browse/SPARK-45712
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau


Overriding the log4j properties is kind of annoying and depends on putting a file
in the right place; we should let users specify which log4j properties file to use
in spark-submit and friends.
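
Until such a flag exists, a workaround sketch (paths are illustrative; Spark 3.3+
ships Log4j 2.x, whose system property is log4j2.configurationFile, and both the
driver and executor JVMs need the option):

# Sketch only: the properties path is an assumption. The options are normally
# passed at submit time, because the driver JVM is already running by the time
# a builder config could apply, e.g.
#   spark-submit \
#     --conf "spark.driver.extraJavaOptions=-Dlog4j2.configurationFile=file:/etc/spark/log4j2.properties" \
#     --conf "spark.executor.extraJavaOptions=-Dlog4j2.configurationFile=file:/etc/spark/log4j2.properties" \
#     app.py
from pyspark.sql import SparkSession

opts = "-Dlog4j2.configurationFile=file:/etc/spark/log4j2.properties"
spark = (
    SparkSession.builder
    # Executor JVMs have not launched yet, so this setting does take effect.
    .config("spark.executor.extraJavaOptions", opts)
    .getOrCreate()
)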



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[spark] branch master updated: [SPARK-44735][SQL] Add warning msg when inserting columns with the same name by row that don't match up

2023-10-23 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ae100267c28b [SPARK-44735][SQL] Add warning msg when inserting columns 
with the same name by row that don't match up
ae100267c28b is described below

commit ae100267c28bc6fd2c2f9c880ed3df1999423992
Author: Jia Fan 
AuthorDate: Mon Oct 23 10:51:31 2023 -0700

[SPARK-44735][SQL] Add warning msg when inserting columns with the same 
name by row that don't match up

### What changes were proposed in this pull request?
This PR adds a warning message when inserting columns with the same names by
row but in a mismatched order, telling the user they can use `INSERT INTO BY NAME`
to reorder the query columns to match the table schema.
It will be like:

![image](https://github.com/apache/spark/assets/32387433/18e57125-8a2e-407c-a3fd-93a9cbf122a1)
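
A small end-to-end sketch of the case the warning targets (the table, columns,
and values are made up; BY NAME is the syntax introduced by SPARK-42750):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.sql("CREATE TABLE people (first STRING, last STRING) USING parquet")

# Positional insert: the query columns carry the table's column names but in
# the opposite order, so values land in the wrong columns -- this is the
# situation the new warning flags.
spark.sql("INSERT INTO people SELECT 'Doe' AS last, 'Jane' AS first")

# Name-based insert: BY NAME matches query columns to table columns by name.
spark.sql("INSERT INTO people BY NAME SELECT 'Doe' AS last, 'Jane' AS first")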

### Why are the changes needed?
Improves the user experience in a common mistake scenario.

### Does this PR introduce _any_ user-facing change?
Yes, sometimes will show some warning.

### How was this patch tested?
Tested locally.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42763 from Hisoka-X/SPARK-44735_add_warning_for_by_name.

Authored-by: Jia Fan 
Signed-off-by: Holden Karau 
---
 .../apache/spark/sql/catalyst/analysis/Analyzer.scala   |  2 ++
 .../sql/catalyst/analysis/TableOutputResolver.scala | 17 -
 .../apache/spark/sql/execution/datasources/rules.scala  |  5 -
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 0469fb29a6fc..06d949ece262 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -3421,6 +3421,8 @@ class Analyzer(override val catalogManager: 
CatalogManager) extends RuleExecutor
   case v2Write: V2WriteCommand
   if v2Write.table.resolved && v2Write.query.resolved && 
!v2Write.outputResolved =>
 validateStoreAssignmentPolicy()
+TableOutputResolver.suitableForByNameCheck(v2Write.isByName,
+  expected = v2Write.table.output, queryOutput = v2Write.query.output)
 val projection = TableOutputResolver.resolveOutputColumns(
   v2Write.table.name, v2Write.table.output, v2Write.query, 
v2Write.isByName, conf)
 if (projection != v2Write.query) {
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala
index d41757725771..1398552399cd 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala
@@ -19,7 +19,9 @@ package org.apache.spark.sql.catalyst.analysis
 
 import scala.collection.mutable
 
+import org.apache.spark.internal.Logging
 import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.SQLConfHelper
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
@@ -34,7 +36,7 @@ import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy
 import org.apache.spark.sql.types.{ArrayType, DataType, DecimalType, 
IntegralType, MapType, StructType}
 
-object TableOutputResolver {
+object TableOutputResolver extends SQLConfHelper with Logging {
 
   def resolveVariableOutputColumns(
   expected: Seq[VariableReference],
@@ -470,6 +472,19 @@ object TableOutputResolver {
 }
   }
 
+  def suitableForByNameCheck(
+  byName: Boolean,
+  expected: Seq[Attribute],
+  queryOutput: Seq[Attribute]): Unit = {
+if (!byName && expected.size == queryOutput.size &&
+  expected.forall(e => queryOutput.exists(p => conf.resolver(p.name, 
e.name))) &&
+  expected.zip(queryOutput).exists(e => !conf.resolver(e._1.name, 
e._2.name))) {
+  logWarning("The query columns and the table columns have same names but 
different " +
+"orders. You can use INSERT [INTO | OVERWRITE] BY NAME to reorder the 
query columns to " +
+"align with the table columns.")
+}
+  }
+
   private def containsIntegralOrDecimalType(dt: DataType): Boolean = dt match {
 case _: IntegralType | _: DecimalType => true
 case a: ArrayType => conta

[LincolnTalk] Roof Repair

2023-10-23 Thread sarah cannon holden
Does anyone have a suggestion for roof repairs?  We have a couple of
missing shingles and a leak.

Sarah
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[jira] [Updated] (SPARK-45563) Spark history files backend currently depend on polling for loading into the history server

2023-10-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-45563:
-
Affects Version/s: 4.0.0
   (was: 3.3.0)
   (was: 3.5.0)
   (was: 3.4.1)

> Spark history files backend currently depend on polling for loading into the 
> history server
> ---
>
> Key: SPARK-45563
> URL: https://issues.apache.org/jira/browse/SPARK-45563
> Project: Spark
>  Issue Type: Improvement
>  Components: UI
>Affects Versions: 4.0.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Minor
>
> The spark history server FS  currently depends on polling for loading history 
> files but we should support on demand loading as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45563) Spark history files backend currently depend on polling for loading into the history server

2023-10-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-45563:
-
Issue Type: Improvement  (was: Bug)

> Spark history files backend currently depend on polling for loading into the 
> history server
> ---
>
> Key: SPARK-45563
> URL: https://issues.apache.org/jira/browse/SPARK-45563
> Project: Spark
>  Issue Type: Improvement
>  Components: UI
>Affects Versions: 3.3.0, 3.4.1, 3.5.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Minor
>
> The spark history server FS backend currently depends on polling for loading 
> rolling output files



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45563) Spark history files backend currently depend on polling for loading into the history server

2023-10-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-45563:
-
Description: The spark history server FS  currently depends on polling for 
loading history files but we should support on demand loading as well.  (was: 
The spark history server FS backend currently depends on polling for loading 
rolling output files)

> Spark history files backend currently depend on polling for loading into the 
> history server
> ---
>
> Key: SPARK-45563
> URL: https://issues.apache.org/jira/browse/SPARK-45563
> Project: Spark
>  Issue Type: Improvement
>  Components: UI
>Affects Versions: 3.3.0, 3.4.1, 3.5.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Minor
>
> The spark history server FS  currently depends on polling for loading history 
> files but we should support on demand loading as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45563) Spark history files backend currently depend on polling for loading into the history server

2023-10-16 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-45563:
-
Summary: Spark history files backend currently depend on polling for 
loading into the history server  (was: Spark rolling history files currently 
depend on polling for loading into the history server)

> Spark history files backend currently depend on polling for loading into the 
> history server
> ---
>
> Key: SPARK-45563
> URL: https://issues.apache.org/jira/browse/SPARK-45563
> Project: Spark
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 3.3.0, 3.4.1, 3.5.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Minor
>
> The spark history server FS backend currently depends on polling for loading 
> rolling output files



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45563) Spark rolling history files currently depend on polling for loading into the history server

2023-10-16 Thread Holden Karau (Jira)
Holden Karau created SPARK-45563:


 Summary: Spark rolling history files currently depend on polling 
for loading into the history server
 Key: SPARK-45563
 URL: https://issues.apache.org/jira/browse/SPARK-45563
 Project: Spark
  Issue Type: Bug
  Components: UI
Affects Versions: 3.5.0, 3.4.1, 3.3.0
Reporter: Holden Karau
Assignee: Holden Karau


The spark history server FS backend currently depends on polling for loading 
rolling output files



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44735) Log a warning when inserting columns with the same name by row that don't match up

2023-10-13 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau resolved SPARK-44735.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

> Log a warning when inserting columns with the same name by row that don't 
> match up
> --
>
> Key: SPARK-44735
> URL: https://issues.apache.org/jira/browse/SPARK-44735
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.2, 3.5.0, 4.0.0
>Reporter: Holden Karau
>Assignee: Jia Fan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> With SPARK-42750 people can now insert by name, but sometimes people forget 
> it. We should log warning when it *looks like* someone forgot it (e.g. insert 
> by column number with all the same names *but* not matching up in row).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44735) Log a warning when inserting columns with the same name by row that don't match up

2023-10-13 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau reassigned SPARK-44735:


Assignee: Jia Fan

> Log a warning when inserting columns with the same name by row that don't 
> match up
> --
>
> Key: SPARK-44735
> URL: https://issues.apache.org/jira/browse/SPARK-44735
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.2, 3.5.0, 4.0.0
>Reporter: Holden Karau
>Assignee: Jia Fan
>Priority: Minor
>  Labels: pull-request-available
>
> With SPARK-42750 people can now insert by name, but sometimes people forget 
> it. We should log warning when it *looks like* someone forgot it (e.g. insert 
> by column number with all the same names *but* not matching up in row).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[Python-Dev] Re: detecting statements which result is not stored

2023-10-05 Thread Steve Holden
Sounds like you might want the "Python Help" group on
https://discuss.python.org - the dev conversation migrated there quite a while
ago now, so this channel is more or less announcements only.

Good luck with your project!

Kind regards,
Steve
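
One concrete way to collect those orphaned expression values, sketched against
the question quoted below (the source string and names are illustrative):
compile the user code as an Interactive AST so that bare expression statements
flow through sys.displayhook, exactly as they do at the REPL.

import ast
import sys

collected = []  # values of expression statements that were never assigned

def _collect(value):
    # displayhook receives the value of each bare expression statement;
    # skip None so statements like print(...) don't pollute the list.
    if value is not None:
        collected.append(value)

sys.displayhook = _collect

source = "solid = 3\nsolid * 5\nsolid - 1\n"
tree = ast.parse(source, mode="exec")
# Re-wrap the parsed statements as an Interactive node so that compiling
# in 'single' mode routes bare expressions through sys.displayhook.
code = compile(ast.Interactive(body=tree.body), "<user>", "single")
exec(code, {})
print(collected)  # -> [15, 2]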


On Thu, 5 Oct 2023 at 16:07, Guenther Sohler 
wrote:

>
> Hi Python Developers, after some investigation I consider dev the correct
> mailing list.
> I got Python embedded into OpenSCAD and I'd like to make more use out of
> it.
>
> statements like these do exactly what they should:
>
> width= 3*5
> solid = make_nice_solid(width)
> other_solid = solid.size(1)
> print(" Everything fine")
>
> But I'd like to be able to write lines like these. These are expressions
> which create a value
> which is apparently NOT stored in a variable:
>
> solid *3
> other_solid - solid
>
> Instead of wasting them, I'd like to collect them in an array and
> send them to the display engine
> after Python evaluation has finished.
>
> Is there some way I can collect the orphaned expressions?
>
>
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/D7GI2EAE73OSYOQI7QKQOCB6SRRYOUWV/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MOXK3OETDDEI24VOM2OY726REYBFOVSF/
Code of Conduct: http://python.org/psf/codeofconduct/


Bug#1052991: apt: Missing Release File

2023-09-26 Thread Stuart Holden
Package: apt
Version: 2.2.4
Severity: important
X-Debbugs-Cc: stu...@gmail.com

Dear Maintainer,


My release is Debian 11.7.

   sudo apt update - gives The repository 'http://archive.raspbian.org/raspbian 
stretch Release' no longer has a release file
   
As I have a current version, I do not expect a missing file.





-- Package-specific info:

-- apt-config dump --

APT "";
APT::Architecture "armhf";
APT::Build-Essential "";
APT::Build-Essential:: "build-essential";
APT::Install-Recommends "1";
APT::Install-Suggests "0";
APT::Sandbox "";
APT::Sandbox::User "_apt";
APT::NeverAutoRemove "";
APT::NeverAutoRemove:: "^firmware-linux.*";
APT::NeverAutoRemove:: "^linux-firmware$";
APT::NeverAutoRemove:: "^linux-image-[a-z0-9]*$";
APT::NeverAutoRemove:: "^linux-image-[a-z0-9]*-[a-z0-9]*$";
APT::VersionedKernelPackages "";
APT::VersionedKernelPackages:: "linux-.*";
APT::VersionedKernelPackages:: "kfreebsd-.*";
APT::VersionedKernelPackages:: "gnumach-.*";
APT::VersionedKernelPackages:: ".*-modules";
APT::VersionedKernelPackages:: ".*-kernel";
APT::Never-MarkAuto-Sections "";
APT::Never-MarkAuto-Sections:: "metapackages";
APT::Never-MarkAuto-Sections:: "contrib/metapackages";
APT::Never-MarkAuto-Sections:: "non-free/metapackages";
APT::Never-MarkAuto-Sections:: "restricted/metapackages";
APT::Never-MarkAuto-Sections:: "universe/metapackages";
APT::Never-MarkAuto-Sections:: "multiverse/metapackages";
APT::Move-Autobit-Sections "";
APT::Move-Autobit-Sections:: "oldlibs";
APT::Move-Autobit-Sections:: "contrib/oldlibs";
APT::Move-Autobit-Sections:: "non-free/oldlibs";
APT::Move-Autobit-Sections:: "restricted/oldlibs";
APT::Move-Autobit-Sections:: "universe/oldlibs";
APT::Move-Autobit-Sections:: "multiverse/oldlibs";
APT::LastInstalledKernel "6.1.21-v8+";
APT::Periodic "";
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Update "";
APT::Update::Post-Invoke-Success "";
APT::Update::Post-Invoke-Success:: "/usr/bin/test -e 
/usr/share/dbus-1/system-services/org.freedesktop.PackageKit.service && 
/usr/bin/test -S /var/run/dbus/system_bus_socket && /usr/bin/gdbus call 
--system --dest org.freedesktop.PackageKit --object-path 
/org/freedesktop/PackageKit --timeout 4 --method 
org.freedesktop.PackageKit.StateHasChanged cache-update > /dev/null; /bin/echo 
> /dev/null";
APT::Architectures "";
APT::Architectures:: "armhf";
APT::Compressor "";
APT::Compressor::. "";
APT::Compressor::.::Name ".";
APT::Compressor::.::Extension "";
APT::Compressor::.::Binary "";
APT::Compressor::.::Cost "0";
APT::Compressor::zstd "";
APT::Compressor::zstd::Name "zstd";
APT::Compressor::zstd::Extension ".zst";
APT::Compressor::zstd::Binary "false";
APT::Compressor::zstd::Cost "60";
APT::Compressor::lz4 "";
APT::Compressor::lz4::Name "lz4";
APT::Compressor::lz4::Extension ".lz4";
APT::Compressor::lz4::Binary "false";
APT::Compressor::lz4::Cost "50";
APT::Compressor::gzip "";
APT::Compressor::gzip::Name "gzip";
APT::Compressor::gzip::Extension ".gz";
APT::Compressor::gzip::Binary "gzip";
APT::Compressor::gzip::Cost "100";
APT::Compressor::gzip::CompressArg "";
APT::Compressor::gzip::CompressArg:: "-6n";
APT::Compressor::gzip::UncompressArg "";
APT::Compressor::gzip::UncompressArg:: "-d";
APT::Compressor::xz "";
APT::Compressor::xz::Name "xz";
APT::Compressor::xz::Extension ".xz";
APT::Compressor::xz::Binary "xz";
APT::Compressor::xz::Cost "200";
APT::Compressor::xz::CompressArg "";
APT::Compressor::xz::CompressArg:: "-6";
APT::Compressor::xz::UncompressArg "";
APT::Compressor::xz::UncompressArg:: "-d";
APT::Compressor::bzip2 "";
APT::Compressor::bzip2::Name "bzip2";
APT::Compressor::bzip2::Extension ".bz2";
APT::Compressor::bzip2::Binary "bzip2";
APT::Compressor::bzip2::Cost "300";
APT::Compressor::bzip2::CompressArg "";
APT::Compressor::bzip2::CompressArg:: "-6";
APT::Compressor::bzip2::UncompressArg "";
APT::Compressor::bzip2::UncompressArg:: "-d";
APT::Compressor::lzma "";
APT::Compressor::lzma::Name "lzma";
APT::Compressor::lzma::Extension ".lzma";
APT::Compressor::lzma::Binary "xz";
APT::Compressor::lzma::Cost "400";
APT::Compressor::lzma::CompressArg "";
APT::Compressor::lzma::CompressArg:: "--format=lzma";
APT::Compressor::lzma::CompressArg:: "-6";
APT::Compressor::lzma::UncompressArg "";
APT::Compressor::lzma::UncompressArg:: "--format=lzma";
APT::Compressor::lzma::UncompressArg:: "-d";
Dir "/";
Dir::State "var/lib/apt";
Dir::State::lists "lists/";
Dir::State::cdroms "cdroms.list";
Dir::State::extended_states "extended_states";
Dir::State::status "/var/lib/dpkg/status";
Dir::Cache "var/cache/apt";
Dir::Cache::archives "archives/";
Dir::Cache::srcpkgcache "srcpkgcache.bin";
Dir::Cache::pkgcache "pkgcache.bin";
Dir::Etc "etc/apt";
Dir::Etc::sourcelist "sources.list";
Dir::Etc::sourceparts "sources.list.d";
Dir::Etc::main "apt.conf";
Dir::Etc::netrc "auth.conf";

[LincolnTalk] Power at 1:59 am

2023-09-14 Thread Sarah Cannon Holden



Sarah Cannon Holden
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



[LincolnTalk] Power outage

2023-09-13 Thread Sarah Cannon Holden
I’m on Weston Rd. without power. What about others?

Sarah Cannon Holden
-- 
The LincolnTalk mailing list.
To post, send mail to Lincoln@lincolntalk.org.
Browse the archives at https://pairlist9.pair.net/mailman/private/lincoln/.
Change your subscription settings at 
https://pairlist9.pair.net/mailman/listinfo/lincoln.



Re: Write Spark Connection client application in Go

2023-09-12 Thread Holden Karau
That’s so cool! Great work y’all :)

On Tue, Sep 12, 2023 at 8:14 PM bo yang  wrote:

> Hi Spark Friends,
>
> Anyone interested in using Golang to write Spark application? We created a 
> Spark
> Connect Go Client library .
> Would love to hear feedback/thoughts from the community.
>
> Please see the quick start guide
> 
> about how to use it. Following is a very short Spark Connect application in
> Go:
>
> func main() {
>   spark, _ := 
> sql.SparkSession.Builder.Remote("sc://localhost:15002").Build()
>   defer spark.Stop()
>
>   df, _ := spark.Sql("select 'apple' as word, 123 as count union all 
> select 'orange' as word, 456 as count")
>   df.Show(100, false)
>   df.Collect()
>
>   df.Write().Mode("overwrite").
>   Format("parquet").
>   Save("file:///tmp/spark-connect-write-example-output.parquet")
>
>   df = spark.Read().Format("parquet").
>   Load("file:///tmp/spark-connect-write-example-output.parquet")
>   df.Show(100, false)
>
>   df.CreateTempView("view1", true, false)
>   df, _ = spark.Sql("select count, word from view1 order by count")
> }
>
>
> Many thanks to Martin, Hyukjin, Ruifeng and Denny for creating and working
> together on this repo! Welcome more people to contribute :)
>
> Best,
> Bo
>
>


Re: [VOTE] Release Apache Spark 3.5.0 (RC4)

2023-09-07 Thread Holden Karau
+1 pip installing seems to function :)
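
A minimal smoke test of that sort, assuming the RC wheel is installed into a
fresh virtualenv (the checks themselves are illustrative):

from pyspark.sql import SparkSession

# Illustrative checks: confirm the installed version is on the RC line and
# that a trivial local job runs end to end.
spark = SparkSession.builder.master("local[2]").appName("rc-smoke").getOrCreate()
assert spark.version.startswith("3.5.0"), spark.version
assert spark.range(10).selectExpr("sum(id) AS s").first()["s"] == 45
spark.stop()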

On Thu, Sep 7, 2023 at 7:22 PM Yuming Wang  wrote:

> +1.
>
> On Thu, Sep 7, 2023 at 10:33 PM yangjie01 
> wrote:
>
>> +1
>>
>>
>>
>> *From:* Gengliang Wang 
>> *Date:* Thursday, September 7, 2023, 12:53
>> *To:* Yuanjian Li 
>> *Cc:* Xiao Li , "her...@databricks.com.invalid"
>> , Spark dev list 
>> *Subject:* Re: [VOTE] Release Apache Spark 3.5.0 (RC4)
>>
>>
>>
>> +1
>>
>>
>>
>> On Wed, Sep 6, 2023 at 9:46 PM Yuanjian Li 
>> wrote:
>>
>> +1 (non-binding)
>>
Xiao Li  wrote on Wednesday, September 6, 2023, at 15:27:
>>
>> +1
>>
>>
>>
>> Xiao
>>
>>
>>
Herman van Hovell  wrote on Wednesday, September 6, 2023, at 22:08:
>>
>> Tested connect, and everything looks good.
>>
>>
>>
>> +1
>>
>>
>>
>> On Wed, Sep 6, 2023 at 8:11 AM Yuanjian Li 
>> wrote:
>>
>> Please vote on releasing the following candidate(RC4) as Apache Spark
>> version 3.5.0.
>>
>>
>>
>> The vote is open until 11:59pm Pacific time *Sep 8th* and passes if a
>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>
>>
>>
>> [ ] +1 Release this package as Apache Spark 3.5.0
>>
>> [ ] -1 Do not release this package because ...
>>
>>
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>>
>>
>> The tag to be voted on is v3.5.0-rc4 (commit
>> c2939589a29dd0d6a2d3d31a8d833877a37ee02a):
>>
>> https://github.com/apache/spark/tree/v3.5.0-rc4
>>
>>
>>
>> The release files, including signatures, digests, etc. can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc4-bin/
>>
>>
>>
>> Signatures used for Spark RCs can be found in this file:
>>
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>>
>>
>> The staging repository for this release can be found at:
>>
>> https://repository.apache.org/content/repositories/orgapachespark-1448
>>
>>
>>
>> The documentation corresponding to this release can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc4-docs/
>>
>>
>>
>> The list of bug fixes going into 3.5.0 can be found at the following URL:
>>
>> https://issues.apache.org/jira/projects/SPARK/versions/12352848
>>
>>
>>
>> This release is using the release script of the tag v3.5.0-rc4.
>>
>>
>>
>> FAQ
>>
>>
>>
>> =
>>
>> How can I help test this release?
>>
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>>
>> an existing Spark workload and running on this release candidate, then
>>
>> reporting any regressions.
>>
>>
>>
>> If you're working in PySpark you can set up a virtual env and install
>>
>> the current RC and see if anything important breaks, in the Java/Scala
>>
>> you can add the staging repository to your projects resolvers and test
>>
>> with the RC (make sure to clean up the artifact cache before/after so
>>
>> you don't end up building with an out of date RC going forward).
>>
>>
>>
>> ===
>>
>> What should happen to JIRA tickets still targeting 3.5.0?
>>
>> ===
>>
>> The current list of open tickets targeted at 3.5.0 can be found at:
>>
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.5.0
>>
>>
>>
>> Committers should look at those and triage. Extremely important bug
>>
>> fixes, documentation, and API tweaks that impact compatibility should
>>
>> be worked on immediately. Everything else please retarget to an
>>
>> appropriate release.
>>
>>
>>
>> ==
>>
>> But my bug isn't fixed?
>>
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>>
>> release unless the bug in question is a regression from the previous
>>
>> release. That being said, if there is something which is a regression
>>
>> that has not been correctly targeted please ping me or a committer to
>>
>> help target the issue.
>>
>>
>>
>> Thanks,
>>
>> Yuanjian Li
>>
>>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [VOTE] Release Apache Spark 3.5.0 (RC3)

2023-09-02 Thread Holden Karau
Can we delay the next RC cut until after Labor Day?

On Sat, Sep 2, 2023 at 9:59 PM Yuanjian Li  wrote:

> Thank you for all the reports!
> The vote has failed. I plan to cut RC4 in two days.
>
> @Dipayan Dev  I quickly skimmed through the
> corresponding ticket, and it doesn't seem to be a regression introduced in
> 3.5. Additionally, someone is asking if this is the same issue as
> SPARK-35279.
> @Yuming Wang  I will check the signature for RC4
> @Jungtaek Lim  I will follow-up with you
> regarding SPARK-45045 
> @Wenchen Fan  Agree, we should include the
> correctness fix in 3.5
>
> Jungtaek Lim  wrote on Thursday, August 31, 2023, at 23:45:
>
>> My apologies, I have to add another ticket for a blocker, SPARK-45045
>> . That said, I'm -1
>> (non-binding).
>>
>> SPARK-43183  made a
>> behavioral change regarding the StreamingQueryListener as well as
>> StreamingQuery API as a side-effect, while the intention was more about
>> introducing the change in the former one. I just got some reports that the
>> behavioral change for StreamingQuery API broke various tests in 3rd party
>> data sources. To help 3rd party ecosystems to adopt 3.5 without hassle, I'd
>> like to see this be fixed in 3.5.0.
>>
>> There is no fix yet but I'm working on it. I'll give an update here.
>> Maybe we could lower down priority and let the release go with describing
>> this as a "known issue", if I couldn't make progress in a couple of days.
>> I'm sorry about that.
>>
>> Thanks,
>> Jungtaek Lim
>>
>> On Fri, Sep 1, 2023 at 12:12 PM Wenchen Fan  wrote:
>>
>>> Sorry for the last-minute bug report, but we found a regression in 3.5:
>>> the SQL INSERT command without a column list fills missing columns with
>>> NULL while Spark 3.4 does not allow it. According to the SQL standard, this
>>> shouldn't be allowed and thus a regression in 3.5.
>>>
>>> The fix has been merged but one day after the RC3 cut:
>>> https://github.com/apache/spark/pull/42393 . I'm -1 and let's include
>>> this fix in 3.5.
>>>
>>> Thanks,
>>> Wenchen
>>>
>>> On Thu, Aug 31, 2023 at 9:09 PM Ian Manning 
>>> wrote:
>>>
 +1 (non-binding)

 Using Spark Core, Spark SQL, Structured Streaming.

 On Tue, Aug 29, 2023 at 8:12 PM Yuanjian Li 
 wrote:

> Please vote on releasing the following candidate(RC3) as Apache Spark
> version 3.5.0.
>
> The vote is open until 11:59pm Pacific time Aug 31st and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.5.0
>
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.5.0-rc3 (commit
> 9f137aa4dc43398aafa0c3e035ed3174182d7d6c):
>
> https://github.com/apache/spark/tree/v3.5.0-rc3
>
> The release files, including signatures, digests, etc. can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
>
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1447
>
> The documentation corresponding to this release can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc3-docs/
>
> The list of bug fixes going into 3.5.0 can be found at the following
> URL:
>
> https://issues.apache.org/jira/projects/SPARK/versions/12352848
>
> This release is using the release script of the tag v3.5.0-rc3.
>
>
> FAQ
>
> =
>
> How can I help test this release?
>
> =
>
> If you are a Spark user, you can help us test this release by taking
>
> an existing Spark workload and running on this release candidate, then
>
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
>
> the current RC and see if anything important breaks, in the Java/Scala
>
> you can add the staging repository to your projects resolvers and test
>
> with the RC (make sure to clean up the artifact cache before/after so
>
> you don't end up building with an out of date RC going forward).
>
> ===
>
> What should happen to JIRA tickets still targeting 3.5.0?
>
> ===
>
> The current list of open tickets targeted at 3.5.0 can be found at:
>
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.5.0
>
> Committers should look at those 

[Spectacle] [Bug 473931] New: Consistent Spectacle Crashes after disconnecting secondary monitor

2023-08-30 Thread Holden
https://bugs.kde.org/show_bug.cgi?id=473931

Bug ID: 473931
   Summary: Consistent Spectacle Crashes after disconnecting
secondary monitor
Classification: Applications
   Product: Spectacle
   Version: 23.04.3
  Platform: openSUSE
OS: Linux
Status: REPORTED
  Keywords: drkonqi
  Severity: crash
  Priority: NOR
 Component: General
  Assignee: noaha...@gmail.com
  Reporter: holdenrf2...@gmail.com
CC: k...@david-redondo.de
  Target Milestone: ---

Application: spectacle (23.04.3)

Qt Version: 5.15.10
Frameworks Version: 5.109.0
Operating System: Linux 6.4.11-1-default x86_64
Windowing System: Wayland
Distribution: "openSUSE Tumbleweed"
DrKonqi: 5.27.7 [KCrashBackend]

-- Information about the crash:
Honestly this happens pretty much every time I disconnect HDMI, so often that I
hardly notice it anymore. It fails fairly gracefully, just pops up in the taskbar,
and I usually ignore it. Possibly related to some larger infrastructural
Wayland/Pipewire issues I'm seeing that are causing massive heat-ups of my CPU
cores, especially with things like X forwarding over ssh, remote connection
software (mostly ConnectWise, which I have to use for work until I integrate a
better solution for our company), scaling certain apps, etc. Likely in large part
due to the amount of weird configurations I have going on for work and general
exploration. That one is really a pain in the ass and I basically have to dump
all my screen connection software to keep my CPU cores from getting well above
80C and throwing the fans on warp speed 9. If contacted about this I'm glad to
generate dumps for that; there are no real crashes, so it hasn't popped up to
prompt a bug report, and I've been too busy to track it down and report manually.

The crash can be reproduced every time.

-- Backtrace:
Application: Spectacle (spectacle), signal: Segmentation fault

[KCrash Handler]
#4  0x5570caad7a30 in ?? ()
#5  0x7ff31c725812 in QtPrivate::QSlotObjectBase::call (a=0x7fffe84f7100,
r=0x7fffe84f7b50, this=0x5570cab9ec20) at
../../include/QtCore/../../src/corelib/kernel/qobjectdefs_impl.h:398
#6  doActivate (sender=0x7fffe84f7af0, signal_index=10,
argv=0x7fffe84f7100) at kernel/qobject.cpp:3925
#7  0x7ff31c71e47f in QMetaObject::activate (sender=<optimized out>,
m=m@entry=0x7ff31d1167c0, local_signal_index=local_signal_index@entry=2,
argv=argv@entry=0x7fffe84f7100) at kernel/qobject.cpp:3985
#8  0x7ff31cb6f012 in QGuiApplication::screenRemoved (this=<optimized out>,
_t1=<optimized out>, _t1@entry=0x5570ca8cc4f0) at
.moc/moc_qguiapplication.cpp:396
#9  0x7ff31cba6e88 in QScreen::~QScreen (this=0x5570ca8cc4f0,
__in_chrg=<optimized out>) at
../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:116
#10 0x7ff31cba6ff9 in QScreen::~QScreen (this=0x5570ca8cc4f0,
__in_chrg=<optimized out>) at kernel/qscreen.cpp:178
#11 0x7ff31cb4f4b7 in QWindowSystemInterface::handleScreenRemoved
(platformScreen=0x5570ca92a1d0) at kernel/qwindowsysteminterface.cpp:844
#12 0x7ff31e7acc02 in
QtWaylandClient::QWaylandDisplay::registry_global_remove (this=0x5570ca90f640,
id=49) at qwaylanddisplay.cpp:571
#13 0x7ff31d6a8962 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#14 0x7ff31d6a52df in ffi_call_int (cif=cif@entry=0x7fffe84f73e0,
fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>,
closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#15 0x7ff31d6a7f26 in ffi_call (cif=cif@entry=0x7fffe84f73e0, fn=<optimized out>, rvalue=rvalue@entry=0x0, avalue=avalue@entry=0x7fffe84f74b0) at
../src/x86/ffi64.c:710
#16 0x7ff31ee35a23 in wl_closure_invoke
(closure=closure@entry=0x7ff308005770, target=<optimized out>,
target@entry=0x5570ca913940, opcode=opcode@entry=1, data=<optimized out>,
flags=1) at ../src/connection.c:1025
#17 0x7ff31ee36203 in dispatch_event (display=0x5570ca9137c0,
queue=<optimized out>) at ../src/wayland-client.c:1631
#18 0x7ff31ee36494 in dispatch_queue (queue=0x5570ca9138b0,
display=0x5570ca9137c0) at ../src/wayland-client.c:1777
#19 wl_display_dispatch_queue_pending (display=0x5570ca9137c0,
queue=0x5570ca9138b0) at ../src/wayland-client.c:2019
#20 0x7ff31e7a9a12 in QtWaylandClient::QWaylandDisplay::flushRequests
(this=<optimized out>) at qwaylanddisplay.cpp:255
#21 0x7ff31c719320 in QObject::event (this=0x5570ca90f640,
e=0x7ff3040027b0) at kernel/qobject.cpp:1347
#22 0x7ff31dfa519e in QApplicationPrivate::notify_helper (this=<optimized out>, receiver=0x5570ca90f640, e=0x7ff3040027b0) at
kernel/qapplication.cpp:3640
#23 0x7ff31c6ed568 in QCoreApplication::notifyInternal2
(receiver=0x5570ca90f640, event=0x7ff3040027b0) at
kernel/qcoreapplication.cpp:1064
#24 0x7ff31c6ed72e in QCoreApplication::sendEvent (receiver=<optimized out>, event=<optimized out>) at kernel/qcoreapplication.cpp:1462
#25 0x7ff31c6f0b61 in QCoreApplicationPrivate::sendPostedEvents
(receiver=0x0, event_type=0, data=0x5570ca8de8e0) at
kernel/qcoreapplication.cpp:1821
#26 0x7ff31c6f10a8 in QCoreApplication::sendPostedEvents
(receiver=<optimized out>, event_type=<optimized out>) at
kernel/qcoreapplication.cpp:1680
#27 0x7ff31c746c93 in 

[jira] [Created] (SPARK-44992) Add support for rack information from an environment variable

2023-08-28 Thread Holden Karau (Jira)
Holden Karau created SPARK-44992:


 Summary: Add support for rack information from an environment 
variable
 Key: SPARK-44992
 URL: https://issues.apache.org/jira/browse/SPARK-44992
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau


This would allow us to use things like EC2_AVAILABILITY_ZONE for locality for 
Kube (or other clusters) which span multiple AZs.
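
As a rough illustration of the idea (a sketch only; where the hook would live is an
open question, and everything named here is hypothetical rather than an existing
Spark API):

// Hypothetical Scala sketch: resolve a host's "rack" from an environment
// variable such as EC2_AVAILABILITY_ZONE, falling back to no rack info.
object EnvRackResolver {
  def rackForThisHost(envVarName: String): Option[String] =
    sys.env.get(envVarName).filter(_.nonEmpty)
}

// e.g. EnvRackResolver.rackForThisHost("EC2_AVAILABILITY_ZONE") might return
// Some("us-west-2a"), letting the scheduler prefer same-AZ placement.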



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Elasticsearch support for Spark 3.x

2023-08-27 Thread Holden Karau
What’s the version of the ES connector you are using?

On Sat, Aug 26, 2023 at 10:17 AM Dipayan Dev 
wrote:

> Hi All,
>
> We're using Spark 2.4.x to write a dataframe into an Elasticsearch index.
> As we're upgrading to Spark 3.3.0, it throws the error
> Caused by: java.lang.ClassNotFoundException: es.DefaultSource
> at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
> at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
> at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
>
> Looking at a few responses on Stack Overflow, it seems this is not yet
> supported by elasticsearch-hadoop.
>
> Does anyone have experience with this? Or faced/resolved this issue in
> Spark 3?
>
> Thanks in advance!
>
> Regards
> Dipayan
>
-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau

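For readers hitting the same error: the short format name "es" resolves to a
DefaultSource class registered by the connector, so a ClassNotFoundException for
es.DefaultSource usually means the elasticsearch-hadoop artifact on the classpath
was not built for your Spark major version. A hedged sketch of the usual fix,
assuming a DataFrame df and an ES host es-host:9200 (the artifact coordinates and
version are illustrative; check the elasticsearch-hadoop release matrix for your
cluster):

// build.sbt (illustrative): use the Spark-3-specific connector artifact.
// libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-30" % "8.x.y"

// Then write using the fully qualified format name rather than the "es" alias:
df.write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", "es-host:9200") // assumption: your cluster address
  .mode("append")
  .save("my-index")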

[jira] [Created] (SPARK-44970) Spark History File Uploads Can Fail on S3

2023-08-25 Thread Holden Karau (Jira)
Holden Karau created SPARK-44970:


 Summary: Spark History File Uploads Can Fail on S3
 Key: SPARK-44970
 URL: https://issues.apache.org/jira/browse/SPARK-44970
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau
Assignee: Holden Karau


Sometimes, if the driver OOMs, the history log upload will not finish.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44955) Add the option for dynamically marking containers for preemption based on data

2023-08-24 Thread Holden Karau (Jira)
Holden Karau created SPARK-44955:


 Summary: Add the option for dynamically marking containers for 
preemption based on data
 Key: SPARK-44955
 URL: https://issues.apache.org/jira/browse/SPARK-44955
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44954) Make DEA algorithms pluggable

2023-08-24 Thread Holden Karau (Jira)
Holden Karau created SPARK-44954:


 Summary: Make DEA algorithms pluggable
 Key: SPARK-44954
 URL: https://issues.apache.org/jira/browse/SPARK-44954
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44953) Log a warning (or automatically disable) when shuffle tracking is enabled along side another DA supported mechanism

2023-08-24 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-44953:
-
Parent: SPARK-44951
Issue Type: Sub-task  (was: Improvement)

> Log a warning (or automatically disable) when shuffle tracking is enabled 
> along side another DA supported mechanism
> ---
>
> Key: SPARK-44953
> URL: https://issues.apache.org/jira/browse/SPARK-44953
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Holden Karau
>Priority: Major
>
> Some people enable both shuffle tracking and another mechanism (like 
> migration) and then are confused when their jobs don't scale down.
>  
> We should at least log a warning here (or automatically disable shuffle 
> tracking?) when it is configured alongside another DA supported mechanism.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44953) Log a warning (or automatically disable) when shuffle tracking is enabled along side another DA supported mechanism

2023-08-24 Thread Holden Karau (Jira)
Holden Karau created SPARK-44953:


 Summary: Log a warning (or automatically disable) when shuffle 
tracking is enabled along side another DA supported mechanism
 Key: SPARK-44953
 URL: https://issues.apache.org/jira/browse/SPARK-44953
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau


Some people enable both shuffle tracking and another mechanism (like migration) 
and then are confused when their jobs don't scale down.

 

We should at least log a warning here (or automatically disable shuffle 
tracking?) when it is configured alongside another DA supported mechanism.
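
As a sketch of what the proposed check might look like (the two config keys are
existing Spark settings, but the placement and wording of the warning are
illustrative):

// Illustrative Scala sketch: warn when shuffle tracking is combined with
// shuffle block migration, since executors may then never scale down.
def warnOnConflictingDaConfigs(conf: org.apache.spark.SparkConf): Unit = {
  val shuffleTracking =
    conf.getBoolean("spark.dynamicAllocation.shuffleTracking.enabled", false)
  val blockMigration =
    conf.getBoolean("spark.storage.decommission.shuffleBlocks.enabled", false)
  if (shuffleTracking && blockMigration) {
    // Spark proper would use its logging framework rather than stderr.
    System.err.println(
      "WARN: shuffle tracking is enabled alongside shuffle block migration; " +
        "executors may be kept alive for shuffle data and never scale down.")
  }
}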



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44951) Improve Spark Dynamic Allocation

2023-08-24 Thread Holden Karau (Jira)
Holden Karau created SPARK-44951:


 Summary: Improve Spark Dynamic Allocation
 Key: SPARK-44951
 URL: https://issues.apache.org/jira/browse/SPARK-44951
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes, Spark Core, YARN
Affects Versions: 4.0.0
Reporter: Holden Karau


For Spark 4 we should aim to improve Spark's dynamic allocation. Some potential 
ideas here include the following:
 * Pluggable DEA algorithms (a rough interface sketch follows this list)
 * How to reduce wastage on the RM side? Sometimes the driver asks for some 
units of resources, but by the time the RM provisions them, the driver has 
cancelled the request.
 * Support for "warm" executor pools which are not tied to a particular driver 
but start and wait for a driver to connect to them and "claim" them.
 * More explicit cost vs. app-runtime configuration: a good DEA algorithm 
should allow the developer to choose between cost and runtime. Sometimes 
developers might be okay paying higher costs for faster execution.
 * Use information from previous runs to inform future runs
 * Better selection of executors to be scaled down
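
A very rough sketch of what the pluggable interface could look like (all names
here are hypothetical; designing the real API is part of the work):

// Hypothetical Scala trait: given the current load, decide how many
// executors the application should be running.
trait DeaAlgorithm {
  def desiredExecutors(
      pendingTasks: Int,
      runningTasks: Int,
      currentExecutors: Int): Int
}

// Example plug-in favouring runtime over cost: scale up aggressively on any
// backlog, assuming roughly four tasks per executor (rounded up).
class AggressiveDea extends DeaAlgorithm {
  override def desiredExecutors(
      pendingTasks: Int,
      runningTasks: Int,
      currentExecutors: Int): Int =
    math.max(currentExecutors, (pendingTasks + runningTasks + 3) / 4)
}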



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44950) Improve Spark Driver Launch Time

2023-08-24 Thread Holden Karau (Jira)
Holden Karau created SPARK-44950:


 Summary: Improve Spark Driver Launch Time
 Key: SPARK-44950
 URL: https://issues.apache.org/jira/browse/SPARK-44950
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes, Spark Core
Affects Versions: 4.0.0
Reporter: Holden Karau


We should try to improve Spark's driver launch time, as it can be a 
bottleneck for smaller queries.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [Internet]Re: Improving Dynamic Allocation Logic for Spark 4+

2023-08-23 Thread Holden Karau
One option could be to launch the driver and the initial executors together
(using the lazy executor ID allocation), but it would introduce a lot of
complexity.

On Wed, Aug 23, 2023 at 6:44 PM Qian Sun  wrote:

> Hi Mich
>
> I agree with your opinion that the startup time of the Spark on Kubernetes
> cluster needs to be improved.
>
> Regarding fetching the image directly, I have utilized an ImageCache to store
> the images on the node, eliminating the time required to pull images from a
> remote repository. This does indeed lead to a reduction in overall time,
> and the effect becomes more pronounced as the size of the image increases.
>
>
> Additionally, I have observed that the driver pod takes a significant
> amount of time from running to attempting to create executor pods, with an
> estimated time expenditure of around 75%. We can also explore optimization
> options in this area.
>
> On Thu, Aug 24, 2023 at 12:58 AM Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Hi all,
>>
>> On this conversation, one of the issues I brought up was the driver
>> start-up time. This is especially true in k8s. As Spark on k8s is modeled
>> on the Spark standalone scheduler, Spark on k8s consists of a single driver
>> pod (like the "master" in standalone) and a number of executors ("workers").
>> When executed on k8s, the driver and executors run on separate pods
>> <https://spark.apache.org/docs/latest/running-on-kubernetes.html>. First
>> the driver pod is launched, then the driver pod itself launches the
>> executor pods. From my observation, in an auto-scaling cluster, the driver
>> pod may take up to 40 seconds, followed by the executor pods. This is a
>> considerable time for customers and it is painfully slow. Can we actually
>> move away from the dependency on standalone mode and try to speed up k8s
>> cluster formation?
>>
>> Another naive question: when the docker image is pulled from the
>> container registry to the driver itself, this takes finite time. The docker
>> image for the executors could be different from the driver's docker image.
>> Since spark-submit presents both at the time of submission, can we save
>> time by fetching the docker images straight away?
>>
>> Thanks
>>
>> Mich
>>
>>
>>view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>> On Tue, 8 Aug 2023 at 18:25, Mich Talebzadeh 
>> wrote:
>>
>>> Splendid idea. 
>>>
>>> Mich Talebzadeh,
>>> Solutions Architect/Engineering Lead
>>> London
>>> United Kingdom
>>>
>>>
>>>view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Tue, 8 Aug 2023 at 18:10, Holden Karau  wrote:
>>>
>>>> The driver itself is probably another topic; perhaps I’ll make a
>>>> “faster spark start time” JIRA and a DA JIRA and we can explore both.
>>>>
>>>> On Tue, Aug 8, 2023 at 10:07 AM Mich Talebzadeh <
>>>> mich.talebza...@gmail.com> wrote:
>>>>
>>>>> From my own perspective faster execution time especially with Spark on
>>>>> tin boxes (Dataproc & EC2) and Spark on k8s is something that customers
>>>>> often bring up.
>>>>>
>>>>> Poor time to onboard with autoscaling seems to be particularly singled
>>>>> out for heavy ETL jobs that use Spark. I am disappointed to see the poor
>>>>> performance of Spark on k8s autopilot with timelines starting the driver

[jira] [Created] (SPARK-44769) Add SQL statement to create an empty array with a type

2023-08-10 Thread Holden Karau (Jira)
Holden Karau created SPARK-44769:


 Summary: Add SQL statement to create an empty array with a type
 Key: SPARK-44769
 URL: https://issues.apache.org/jira/browse/SPARK-44769
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: Holden Karau
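
For context, the workaround today is an explicit cast; a hedged illustration,
assuming a spark-shell session with a SparkSession named spark (the dedicated
syntax this ticket proposes does not exist yet):

// array() alone cannot infer a useful element type from zero arguments,
// so cast it to get, say, an empty ARRAY<INT>:
val df = spark.sql("SELECT CAST(array() AS ARRAY<INT>) AS empty_ints")
df.printSchema() // empty_ints: array (element: integer)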






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42035) Add a config flag to force exit on JDK major version mismatch

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-42035:
-
Target Version/s: 4.0.0

> Add a config flag to force exit on JDK major version mismatch
> -
>
> Key: SPARK-42035
> URL: https://issues.apache.org/jira/browse/SPARK-42035
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> JRE version mismatches can cause errors which are difficult to debug 
> (potentially correctness issues with serialization). We should add a flag for 
> platforms which wish to "fail fast" and exit on major version mismatch.
>  
> I think this could be a good thing to have default on in Spark 4.
>  
> Generally I expect to see more folks upgrading JRE & JDKs in the coming few 
> years.
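
A minimal sketch of the version comparison itself, in Scala (the flag and the
point where the driver's version would be communicated to executors are
hypothetical):

// Parse the major version out of java.version strings:
// "1.8.0_292" -> 8, "11.0.19" -> 11, "17" -> 17.
def jdkMajorVersion(versionString: String): Int = {
  val parts = versionString.split("\\.")
  if (parts(0) == "1") parts(1).toInt else parts(0).toInt
}

// Hypothetical fail-fast check on an executor, given the driver's version.
def checkJdkMatch(driverJavaVersion: String, failFast: Boolean): Unit = {
  val local = jdkMajorVersion(System.getProperty("java.version"))
  val driver = jdkMajorVersion(driverJavaVersion)
  if (local != driver && failFast) {
    System.err.println(s"JDK major version mismatch: driver=$driver, executor=$local")
    sys.exit(1)
  }
}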



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42261) K8s will not allocate more execs if there are any pending execs until next snapshot

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-42261:
-
Target Version/s: 4.0.0

> K8s will not allocate more execs if there are any pending execs until next 
> snapshot
> ---
>
> Key: SPARK-42261
> URL: https://issues.apache.org/jira/browse/SPARK-42261
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0, 3.4.1, 3.5.0
>Reporter: Holden Karau
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44511) Allow insertInto to succeed with partion columns specified when they match those on the target table

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-44511:
-
Target Version/s: 4.0.0

> Allow insertInto to succeed with partion columns specified when they match 
> those on the target table
> 
>
> Key: SPARK-44511
> URL: https://issues.apache.org/jira/browse/SPARK-44511
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Holden Karau
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42361) Add an option to use external storage to distribute JAR set in cluster mode on Kube

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-42361:
-
Target Version/s: 4.0.0

> Add an option to use external storage to distribute JAR set in cluster mode 
> on Kube
> ---
>
> Key: SPARK-42361
> URL: https://issues.apache.org/jira/browse/SPARK-42361
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.5.0
>Reporter: Holden Karau
>Priority: Minor
>
> tl;dr – sometimes the driver can get overwhelmed serving the initial jar set. 
> You'll see a lot of "Executor fetching spark://.../jar" followed by 
> connection timeouts.
>  
> On YARN the jars (in cluster mode) are cached in HDFS.
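
A partial escape hatch already exists today: spark.jars accepts remote URIs, so
the jar set can be served from external storage instead of the driver's file
server. A sketch (the bucket path is hypothetical, and it assumes the S3A
connector is on the classpath):

// Illustrative Scala: have executors fetch application jars from S3 rather
// than spark:// URLs served by the driver.
val conf = new org.apache.spark.SparkConf()
  .set("spark.jars", "s3a://my-bucket/jars/app-deps.jar") // hypothetical path
// The proposed improvement would automate staging the jar set to such
// storage for cluster mode on Kube, similar to YARN caching jars in HDFS.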



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42260) Log when the K8s Exec Pods Allocator Stalls

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-42260:
-
Target Version/s: 4.0.0

> Log when the K8s Exec Pods Allocator Stalls
> ---
>
> Key: SPARK-42260
> URL: https://issues.apache.org/jira/browse/SPARK-42260
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Minor
>
> Sometimes if the K8s APIs are being slow the ExecutorPods allocator can stall 
> and it would be good for us to log this (and how long we've stalled for) so 
> folks can tell more clearly why Spark is unable to reach the desired target 
> number of executors.
>  
> This is _somewhat_ related to SPARK-36664 which logs the time spent waiting 
> for executor allocation but goes a step further for K8s and logs when we've 
> stalled because we have too many pending pods.
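
As a sketch of the kind of logging intended (illustrative only, not the actual
ExecutorPodsAllocator code):

// Illustrative Scala: remember when allocation last made progress and log
// how long we have been blocked behind pending pods.
class StallLogger(stallThresholdMs: Long) {
  private var lastProgressMs: Long = System.currentTimeMillis()

  def onProgress(): Unit = { lastProgressMs = System.currentTimeMillis() }

  def maybeLogStall(pendingPods: Int): Unit = {
    val stalledForMs = System.currentTimeMillis() - lastProgressMs
    if (pendingPods > 0 && stalledForMs > stallThresholdMs) {
      System.err.println(
        s"WARN: executor allocation stalled for ${stalledForMs}ms " +
          s"behind $pendingPods pending pods")
    }
  }
}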



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-44727) Improve the error message for dynamic allocation conditions

2023-08-09 Thread Holden Karau (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752496#comment-17752496
 ] 

Holden Karau commented on SPARK-44727:
--

Do you have more context [~chengpan] ?

> Improve the error message for dynamic allocation conditions
> ---
>
> Key: SPARK-44727
> URL: https://issues.apache.org/jira/browse/SPARK-44727
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.5.0
>Reporter: Cheng Pan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42035) Add a config flag to force exit on JDK major version mismatch

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-42035:
-
Description: 
JRE version mismatches can cause errors which are difficult to debug 
(potentially correctness issues with serialization). We should add a flag for 
platforms which wish to "fail fast" and exit on major version mismatch.

 

I think this could be a good thing to have default on in Spark 4.

 

Generally I expect to see more folks upgrading JRE & JDKs in the coming few 
years.

  was:JRE version mismatches can cause errors which are difficult to debug 
(potentially correctness issues with serialization). We should add a flag for 
platforms which wish to "fail fast" and exit on major version mismatch.


> Add a config flag to force exit on JDK major version mismatch
> -
>
> Key: SPARK-42035
> URL: https://issues.apache.org/jira/browse/SPARK-42035
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>    Reporter: Holden Karau
>Assignee: Holden Karau
>Priority: Major
>
> JRE version mismatches can cause errors which are difficult to debug 
> (potentially correctness issues with serialization). We should add a flag for 
> platforms which wish to "fail fast" and exit on major version mismatch.
>  
> I think this could be a good thing to have default on in Spark 4.
>  
> Generally I expect to see more folks upgrading JRE & JDKs in the coming few 
> years.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-34337) Reject disk blocks when out of disk space

2023-08-09 Thread Holden Karau (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-34337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Holden Karau updated SPARK-34337:
-
Target Version/s: 4.0.0

> Reject disk blocks when out of disk space
> -
>
> Key: SPARK-34337
> URL: https://issues.apache.org/jira/browse/SPARK-34337
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.1.1, 3.1.2, 3.2.0
>Reporter: Holden Karau
>Priority: Major
>
> Now that we have the ability to store shuffle blocks on disaggregated 
> storage (when configured) we should add the option to reject storing blocks 
> locally on an executor beyond a certain disk pressure threshold.
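
A sketch of the threshold check (illustrative; the real change would live in the
block manager's disk store, and the directory and fraction below are assumptions):

// Illustrative Scala: refuse a local disk block when the free space on the
// volume falls below a configured fraction.
import java.io.File

def underDiskPressure(dir: File, minFreeFraction: Double): Boolean = {
  val total = dir.getTotalSpace.toDouble
  total > 0 && (dir.getUsableSpace / total) < minFreeFraction
}

// e.g. reject local storage (falling back to disaggregated storage) when
// less than 5% of the volume is free:
// if (underDiskPressure(new File("/local/spark"), 0.05)) { /* reject */ }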



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-44735) Log a warning when inserting columns with the same name by row that don't match up

2023-08-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-44735:


 Summary: Log a warning when inserting columns with the same name 
by row that don't match up
 Key: SPARK-44735
 URL: https://issues.apache.org/jira/browse/SPARK-44735
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.2, 3.5.0, 4.0.0
Reporter: Holden Karau


With SPARK-42750 people can now insert by name, but sometimes people forget to. 
We should log a warning when it *looks like* someone forgot it (e.g. an insert by 
column position where all the column names match the target *but* not positionally).
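
To make the failure mode concrete (a hedged sketch; the table and column names
are made up, and BY NAME is the SPARK-42750 syntax the description refers to):

// Both sides use the same column names, but in swapped order; a positional
// INSERT matches by position, so the values land in the wrong columns:
spark.sql("CREATE TABLE t (first_name STRING, last_name STRING) USING parquet")
spark.sql("INSERT INTO t SELECT 'Karau' AS last_name, 'Holden' AS first_name")
// Result: first_name='Karau', last_name='Holden'. The proposed warning would
// fire here; INSERT INTO t BY NAME SELECT ... avoids the mixup.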



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


