Re: Improving metadata in Spark JIRA

2015-02-06 Thread Patrick Wendell
Per Nick's suggestion I added two components:

1. Spark Submit
2. Spark Scheduler

I figured I would just add these since if we decide later we don't
want them, we can simply merge them into Spark Core.

On Fri, Feb 6, 2015 at 11:53 AM, Nicholas Chammas wrote:
> Do we need some new components to be added to the JIRA project?
>
> Like:
>
>    - scheduler
>    - YARN
>    - spark-submit
>    - ...?
>
> Nick
>
>
> On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> +9000 on cleaning up JIRA.
>>
>> Thank you Sean for laying out some specific things to tackle. I will
>> assist with this.
>>
>> Regarding email, I think Sandy is right. I only get JIRA email for issues
>> I'm watching.
>>
>> Nick
>>
>> On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza wrote:
>>
>>> JIRA updates don't go to this list, they go to iss...@spark.apache.org. I
>>> don't think many are signed up for that list, and those that are probably
>>> have a flood of emails anyway.
>>>
>>> So I'd definitely be in favor of any JIRA cleanup that you're up for.
>>>
>>> -Sandy
>>>
>>> On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen  wrote:
>>>
>>> > I've wasted no time in wielding the commit bit to complete a number of
>>> > small, uncontroversial changes. I wouldn't commit anything that didn't
>>> > already appear to have review, consensus and little risk, but please
>>> > let me know if anything looked a little too bold, so I can calibrate.
>>> >
>>> >
>>> > Anyway, I'd like to continue some small house-cleaning by improving
>>> > the state of JIRA's metadata, in order to let it give us a little
>>> > clearer view on what's happening in the project:
>>> >
>>> > a. Add Component to every (open) issue that's missing one
>>> > b. Review all Critical / Blocker issues to de-escalate ones that seem
>>> > obviously neither
>>> > c. Correct open issues that list a Fix version that has already been
>>> > released
>>> > d. Close all issues Resolved for a release that has already been
>>> > released
>>> >
>>> > The problem with doing so is that it will create a tremendous amount
>>> > of email to the list, like, several hundred. It's possible to make
>>> > bulk changes and suppress e-mail though, which could be done for all
>>> > but b.
>>> >
>>> > Better to suppress the emails when making such changes? or just not
>>> > bother on some of these?
>>> >
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: dev-h...@spark.apache.org
>>> >
>>> >
>>>
>>




Re: Improving metadata in Spark JIRA

2015-02-06 Thread Hari Shreedharan
+1. JIRA cleanup would be good. Please let me know if I can help in some way!

Thanks, Hari

On Fri, Feb 6, 2015 at 11:56 AM, Nicholas Chammas wrote:

> Do we need some new components to be added to the JIRA project?
> Like:
>    - scheduler
>    - YARN
>    - spark-submit
>    - …?
> Nick
>
> On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>> +9000 on cleaning up JIRA.
>>
>> Thank you Sean for laying out some specific things to tackle. I will
>> assist with this.
>>
>> Regarding email, I think Sandy is right. I only get JIRA email for issues
>> I'm watching.
>>
>> Nick
>>
>> On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza wrote:
>>
>>> JIRA updates don't go to this list, they go to iss...@spark.apache.org. I
>>> don't think many are signed up for that list, and those that are probably
>>> have a flood of emails anyway.
>>>
>>> So I'd definitely be in favor of any JIRA cleanup that you're up for.
>>>
>>> -Sandy
>>>
>>> On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen  wrote:
>>>
>>> > I've wasted no time in wielding the commit bit to complete a number of
>>> > small, uncontroversial changes. I wouldn't commit anything that didn't
>>> > already appear to have review, consensus and little risk, but please
>>> > let me know if anything looked a little too bold, so I can calibrate.
>>> >
>>> >
>>> > Anyway, I'd like to continue some small house-cleaning by improving
>>> > the state of JIRA's metadata, in order to let it give us a little
>>> > clearer view on what's happening in the project:
>>> >
>>> > a. Add Component to every (open) issue that's missing one
>>> > b. Review all Critical / Blocker issues to de-escalate ones that seem
>>> > obviously neither
>>> > c. Correct open issues that list a Fix version that has already been
>>> > released
>>> > d. Close all issues Resolved for a release that has already been
>>> > released
>>> >
>>> > The problem with doing so is that it will create a tremendous amount
>>> > of email to the list, like, several hundred. It's possible to make
>>> > bulk changes and suppress e-mail though, which could be done for all
>>> > but b.
>>> >
>>> > Better to suppress the emails when making such changes? or just not
>>> > bother on some of these?
>>> >
>>> >
>>> >
>>>
>>

Re: Improving metadata in Spark JIRA

2015-02-06 Thread Nicholas Chammas
Do we need some new components to be added to the JIRA project?

Like:

   - scheduler
   - YARN
   - spark-submit
   - …?

Nick

On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> +9000 on cleaning up JIRA.
>
> Thank you Sean for laying out some specific things to tackle. I will
> assist with this.
>
> Regarding email, I think Sandy is right. I only get JIRA email for issues
> I'm watching.
>
> Nick
>
> On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza wrote:
>
>> JIRA updates don't go to this list, they go to iss...@spark.apache.org. I
>> don't think many are signed up for that list, and those that are probably
>> have a flood of emails anyway.
>>
>> So I'd definitely be in favor of any JIRA cleanup that you're up for.
>>
>> -Sandy
>>
>> On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen  wrote:
>>
>> > I've wasted no time in wielding the commit bit to complete a number of
>> > small, uncontroversial changes. I wouldn't commit anything that didn't
>> > already appear to have review, consensus and little risk, but please
>> > let me know if anything looked a little too bold, so I can calibrate.
>> >
>> >
>> > Anyway, I'd like to continue some small house-cleaning by improving
>> > the state of JIRA's metadata, in order to let it give us a little
>> > clearer view on what's happening in the project:
>> >
>> > a. Add Component to every (open) issue that's missing one
>> > b. Review all Critical / Blocker issues to de-escalate ones that seem
>> > obviously neither
>> > c. Correct open issues that list a Fix version that has already been
>> > released
>> > d. Close all issues Resolved for a release that has already been
>> > released
>> >
>> > The problem with doing so is that it will create a tremendous amount
>> > of email to the list, like, several hundred. It's possible to make
>> > bulk changes and suppress e-mail though, which could be done for all
>> > but b.
>> >
>> > Better to suppress the emails when making such changes? or just not
>> > bother on some of these?
>> >
>> >
>> >
>>
>


Re: Improving metadata in Spark JIRA

2015-02-06 Thread Nicholas Chammas
+9000 on cleaning up JIRA.

Thank you Sean for laying out some specific things to tackle. I will assist
with this.

Regarding email, I think Sandy is right. I only get JIRA email for issues
I'm watching.

Nick

On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza  wrote:

> JIRA updates don't go to this list, they go to iss...@spark.apache.org.  I
> don't think many are signed up for that list, and those that are probably
> have a flood of emails anyway.
>
> So I'd definitely be in favor of any JIRA cleanup that you're up for.
>
> -Sandy
>
> On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen  wrote:
>
> > I've wasted no time in wielding the commit bit to complete a number of
> > small, uncontroversial changes. I wouldn't commit anything that didn't
> > already appear to have review, consensus and little risk, but please
> > let me know if anything looked a little too bold, so I can calibrate.
> >
> >
> > Anyway, I'd like to continue some small house-cleaning by improving
> > the state of JIRA's metadata, in order to let it give us a little
> > clearer view on what's happening in the project:
> >
> > a. Add Component to every (open) issue that's missing one
> > b. Review all Critical / Blocker issues to de-escalate ones that seem
> > obviously neither
> > c. Correct open issues that list a Fix version that has already been
> > released
> > d. Close all issues Resolved for a release that has already been released
> >
> > The problem with doing so is that it will create a tremendous amount
> > of email to the list, like, several hundred. It's possible to make
> > bulk changes and suppress e-mail though, which could be done for all
> > but b.
> >
> > Better to suppress the emails when making such changes? or just not
> > bother on some of these?
> >
> >
> >
>


Re: Improving metadata in Spark JIRA

2015-02-06 Thread Sandy Ryza
JIRA updates don't go to this list, they go to iss...@spark.apache.org.  I
don't think many are signed up for that list, and those that are probably
have a flood of emails anyway.

So I'd definitely be in favor of any JIRA cleanup that you're up for.

-Sandy

On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen  wrote:

> I've wasted no time in wielding the commit bit to complete a number of
> small, uncontroversial changes. I wouldn't commit anything that didn't
> already appear to have review, consensus and little risk, but please
> let me know if anything looked a little too bold, so I can calibrate.
>
>
> Anyway, I'd like to continue some small house-cleaning by improving
> the state of JIRA's metadata, in order to let it give us a little
> clearer view on what's happening in the project:
>
> a. Add Component to every (open) issue that's missing one
> b. Review all Critical / Blocker issues to de-escalate ones that seem
> obviously neither
> c. Correct open issues that list a Fix version that has already been
> released
> d. Close all issues Resolved for a release that has already been released
>
> The problem with doing so is that it will create a tremendous amount
> of email to the list, like, several hundred. It's possible to make
> bulk changes and suppress e-mail though, which could be done for all
> but b.
>
> Better to suppress the emails when making such changes? or just not
> bother on some of these?
>
>
>


Improving metadata in Spark JIRA

2015-02-06 Thread Sean Owen
I've wasted no time in wielding the commit bit to complete a number of
small, uncontroversial changes. I wouldn't commit anything that didn't
already appear to have review, consensus and little risk, but please
let me know if anything looked a little too bold, so I can calibrate.


Anyway, I'd like to continue some small house-cleaning by improving
the state of JIRA's metadata, in order to let it give us a little
clearer view on what's happening in the project:

a. Add Component to every (open) issue that's missing one
b. Review all Critical / Blocker issues to de-escalate ones that seem
obviously neither
c. Correct open issues that list a Fix version that has already been released
d. Close all issues Resolved for a release that has already been released

The problem with doing so is that it will create a tremendous amount
of email to the list, like, several hundred. It's possible to make
bulk changes and suppress e-mail though, which could be done for all
but b.

Better to suppress the emails when making such changes? or just not
bother on some of these?
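
For anyone who wants to see how large each batch would be before deciding, the four items above could be expressed as JIRA search filters. This is only a rough sketch: the project key SPARK, the dictionary keys and the helper name bulk_safe are assumptions made for illustration, and the exact field and status names may differ on the actual JIRA instance.

```python
# Hypothetical JQL filters for cleanup items a-d.
# Assumptions: project key is SPARK; stock JIRA field and status names.
CLEANUP_FILTERS = {
    "a_missing_component": (
        "project = SPARK AND resolution = Unresolved AND component is EMPTY"
    ),
    "b_high_priority": (
        "project = SPARK AND resolution = Unresolved "
        "AND priority in (Blocker, Critical)"
    ),
    "c_stale_fix_version": (
        "project = SPARK AND resolution = Unresolved "
        "AND fixVersion in releasedVersions()"
    ),
    "d_resolved_not_closed": (
        "project = SPARK AND status = Resolved "
        "AND fixVersion in releasedVersions()"
    ),
}


def bulk_safe(task: str) -> bool:
    """True if the item could be a bulk change with e-mail suppressed (all but b)."""
    return not task.startswith("b_")


for name, jql in CLEANUP_FILTERS.items():
    mode = "bulk, e-mail suppressed" if bulk_safe(name) else "manual review"
    print(f"{name} [{mode}]: {jql}")
```

Item b stays manual because de-escalating a priority is a per-issue judgment call, whereas the other three are mechanical.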




Data source API | sizeInBytes should be moved to *Scan

2015-02-06 Thread Aniket Bhatnagar
Hi Spark SQL committers

I have started experimenting with the data sources API and I was wondering if
it makes sense to move the method sizeInBytes from BaseRelation to the Scan
interfaces. This is because a relation may be able to leverage filter
push-down to estimate its size, potentially making a very large relation
broadcastable. Thoughts?

Aniket
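
To make the proposal concrete, here is a toy model of the difference between a relation-level and a scan-level size estimate. It is a language-neutral sketch in Python, not the Spark Scala API: ToyRelation, scan_size_in_bytes, is_broadcastable and the 10 MB threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Filter = Callable[[Row], bool]

# Assumed broadcast cut-off for this sketch (10 MB); a real deployment
# would read the threshold from its configuration.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class ToyRelation:
    """Hypothetical stand-in for a data source relation."""
    rows: List[Row]
    bytes_per_row: int

    def size_in_bytes(self) -> int:
        # Relation-level estimate: knows nothing about the filters.
        return len(self.rows) * self.bytes_per_row

    def scan_size_in_bytes(self, filters: List[Filter]) -> int:
        # Scan-level estimate: counts only rows surviving pushed-down filters.
        # A real source would consult statistics or an index rather than
        # touching every row as this toy does.
        surviving = sum(1 for row in self.rows if all(f(row) for f in filters))
        return surviving * self.bytes_per_row


def is_broadcastable(estimated_bytes: int) -> bool:
    return estimated_bytes <= BROADCAST_THRESHOLD_BYTES


relation = ToyRelation(
    rows=[{"country": "IN" if i % 100 == 0 else "US"} for i in range(100_000)],
    bytes_per_row=1024,
)
filters = [lambda row: row["country"] == "IN"]

print(is_broadcastable(relation.size_in_bytes()))             # whole relation
print(is_broadcastable(relation.scan_size_in_bytes(filters)))  # filtered scan
```

In this model the unfiltered relation is about 100 MB and misses the cut-off, while the filtered scan keeps 1 in 100 rows and fits comfortably, which is the case the question is about.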