ooo large commit breaks mirror sync !!

2012-02-20 Thread Gavin McDonald
If anyone wants to make a large commit like the one linked below [1] (over 8000
paths changed and lots of zip and tar files), PLEASE notify infra first and
schedule it for a WEEKEND.

People have been struggling to commit to the EU mirror since this large
commit was started 5 hours ago.

Thanks

Gav...

[1] - http://svn.apache.org/viewvc?view=revision&sortby=rev&sortdir=down&revision=1291394





Re: ooo large commit breaks mirror sync !!

2012-02-21 Thread Armin Le Grand

Hi Gavin,

On 21.02.2012 00:21, Gavin McDonald wrote:

> If anyone wants to make a large commit like the one linked below [1] (over 8000
> paths changed and lots of zip and tar files), PLEASE notify infra first and
> schedule it for a WEEKEND.


Sorry about that. All I did was update a branch to current trunk. I was
also surprised that so many files had been touched in the one week I
wanted to catch up on, obviously mainly flag changes.


If updating a simple work branch can lead to this, something is not
optimal and we should think about it. When working with svn on a simple
branch (where only single files were changed), it is not really an option
to wait until the weekend and notify someone before continuing to work,
especially when you want to use that branch to get the code to different
build machines on various OSes or to various colleagues.


We have svn branches as the mechanism for that. I do not want to go back
to creating diffs and syncing repos on different machines by hand; this
cannot be the solution, IMHO.


I know, looking at it now, that technically I could have extracted a
diff from my branch, thrown the branch away, created a new one from trunk
and applied the diff. But this would have required knowing beforehand how
many changes had been made and that the commit would be extreme. It's also
not a good solution when you are committed to a project which works with
a code versioning system.


Please point me to a solution which would be compliant with svn usage 
and infra and I'll surely use it next time.


From my POV it also shows that the mechanism to update svn branches is 
not optimal; all those binary files which were committed and thus 
transferred again were already on the trunk, so technically it may be 
time to think about a more effective way to update branches with svn. 
Creating a branch does a 'cheap copy' (CopyOnWrite - COW), so I would 
have expected that updating would somehow try to stay with this and not 
transfer all files again. Maybe it would be possible (in the commit step 
of updating branches) to send a checksum of a file first, see if it's 
the same as on trunk and add it as COW-copy...
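
As a rough illustration of the checksum idea above (not a description of how
Subversion actually works), the following Python sketch hashes the files in
two local checkouts and lists the branch files whose content is byte-identical
to the file at the same relative path on trunk. The function names and the two
directory arguments are hypothetical, and SHA-1 is used only as an example
digest.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_files(trunk: Path, branch: Path) -> List[str]:
    """List relative paths whose content is identical on trunk and branch."""
    same = []
    for branch_file in sorted(branch.rglob("*")):
        relative = branch_file.relative_to(branch)
        # Skip directories and svn's own administrative area.
        if not branch_file.is_file() or ".svn" in relative.parts:
            continue
        trunk_file = trunk / relative
        if trunk_file.is_file() and file_digest(trunk_file) == file_digest(branch_file):
            same.append(relative.as_posix())
    return same
```

Every path this reports is one whose content would not need to be sent again
if only checksums were compared first.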



> People have been struggling to commit to the EU mirror since this large
> commit was started 5 hours ago.
>
> Thanks
>
> Gav...
>
> [1] - http://svn.apache.org/viewvc?view=revision&sortby=rev&sortdir=down&revision=1291394



Sincerely,
Armin
--
ALG



Re: ooo large commit breaks mirror sync !!

2012-02-21 Thread Dave Fisher
Hi Armin,

 U   branches/alg/install/ext_sources/48470d662650c3c074e1c3fabbc67bbd-README_source-9.0.0.7-bj.txt
 U   branches/alg/install/ext_sources/3b179ed18f65c43141528aa6d2440db4-serf-1.0.0.tar.bz2
 U   branches/alg/install/ext_sources/2c9b0f83ed5890af02c0df1c1776f39b-commons-httpclient-3.1-src.tar.gz
 U   branches/alg/install/ext_sources/48a9f787f43a09c0a9b7b00cd1fddbbf-hyphen-2.7.1.tar.gz
 U   branches/alg/install/ext_sources/bc702168a2af16869201dbe91e46ae48-LICENSE_Python-2.6.1
...
 U   branches/alg/install/ext_sources/c441926f3a552ed3e5b274b62e86af16-STLport-4.0.tar.gz

Is it necessary to have all these external source tarballs in the branch? These 
files are kept outside of trunk for a reason - to avoid sledgehammers on 
changes.

Consider using an svn tag on ext_source tarballs in your builds.

Regards,
Dave




Re: ooo large commit breaks mirror sync !!

2012-02-21 Thread Pedro Giffuni

On 02/21/12 12:46, Dave Fisher wrote:

> Hi Armin,
>
>  U   branches/alg/install/ext_sources/48470d662650c3c074e1c3fabbc67bbd-README_source-9.0.0.7-bj.txt
>  U   branches/alg/install/ext_sources/3b179ed18f65c43141528aa6d2440db4-serf-1.0.0.tar.bz2
>  U   branches/alg/install/ext_sources/2c9b0f83ed5890af02c0df1c1776f39b-commons-httpclient-3.1-src.tar.gz
>  U   branches/alg/install/ext_sources/48a9f787f43a09c0a9b7b00cd1fddbbf-hyphen-2.7.1.tar.gz
>  U   branches/alg/install/ext_sources/bc702168a2af16869201dbe91e46ae48-LICENSE_Python-2.6.1
> ...
>  U   branches/alg/install/ext_sources/c441926f3a552ed3e5b274b62e86af16-STLport-4.0.tar.gz
>
> Is it necessary to have all these external source tarballs in the branch? These
> files are kept outside of trunk for a reason - to avoid sledgehammers on
> changes.


I have always been of the idea of keeping them outside the repository.

I think that sourceforge will provide some extra space in the extensions
site where we could keep them... but it's just an idea for later.

Pedro.



Re: ooo large commit breaks mirror sync !!

2012-02-21 Thread Dave Fisher

On Feb 21, 2012, at 12:59 PM, Pedro Giffuni wrote:

> On 02/21/12 12:46, Dave Fisher wrote:
>> Hi Armin,
>> 
>>  U   branches/alg/install/ext_sources/48470d662650c3c074e1c3fabbc67bbd-README_source-9.0.0.7-bj.txt
>>  U   branches/alg/install/ext_sources/3b179ed18f65c43141528aa6d2440db4-serf-1.0.0.tar.bz2
>>  U   branches/alg/install/ext_sources/2c9b0f83ed5890af02c0df1c1776f39b-commons-httpclient-3.1-src.tar.gz
>>  U   branches/alg/install/ext_sources/48a9f787f43a09c0a9b7b00cd1fddbbf-hyphen-2.7.1.tar.gz
>>  U   branches/alg/install/ext_sources/bc702168a2af16869201dbe91e46ae48-LICENSE_Python-2.6.1
>> ...
>>  U   branches/alg/install/ext_sources/c441926f3a552ed3e5b274b62e86af16-STLport-4.0.tar.gz
>> 
>> Is it necessary to have all these external source tarballs in the branch? 
>> These files are kept outside of trunk for a reason - to avoid sledgehammers 
>> on changes.
> 
> I have always been of the idea of keeping them outside the repository.

They're OK where they are right now. I just think they need to be on a branch. 
The proper action in svn is to tag.

> 
> I think that sourceforge will provide some extra space in the extensions
> site where we could keep them... but it's just an idea for later.

To me, dealing with the build dependencies is a post-release and pre-graduation 
issue. SourceForge might be the place if we need a forked version of a tool, 
but I really think that we should be using the dependent project's distribution 
as much as we can. For example Apache Tomcat 5.5.35 will always be available 
from http://archive.apache.org/dist/tomcat/tomcat-5/v5.5.35/

I'm certain that Rob will come up with the case of an orphaned project or one 
where needed alterations are not taken by the upstream. These are the 
SourceForge or new Incubator project cases.

Here is a list of what appear to be Apache projects (not intended to be complete):

commons-lang-2.3-src.tar.gz
commons-httpclient-3.1-src.tar.gz
commons-logging-1.1.1-src.tar.gz
commons-codec-1.3-src.tar.gz
lucene-2.3.2.tar.gz
apr-util-1.4.1.tar.gz
apr-1.4.5.tar.gz
apache-tomcat-5.5.35-src.tar.gz

In Apache POI we pull dependencies in an ant build script from maven.

[Ant build snippet lost in the archive; the only surviving fragment is the
repository URL http://repo1.maven.org]

I'm just saying that much can be managed outside of the Apache ooo repository. 
The best place is somewhere the dependent project has placed their releases.
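
A minimal sketch of that approach in Python: fetch a dependency from wherever
the upstream project publishes its releases and check it against a known
digest, instead of keeping the tarball in svn. It assumes that the
32-hex-digit prefix on the ext_sources file names (for example
3b179ed18f65c43141528aa6d2440db4-serf-1.0.0.tar.bz2) is the MD5 digest of the
file; that is an assumption, not something stated in this thread. All function
names are hypothetical.

```python
import hashlib
import re
import urllib.request
from typing import Tuple

# Assumption: ext_sources names look like "<32 hex digits>-<upstream file name>".
EXT_SOURCE_NAME = re.compile(r"^([0-9a-f]{32})-(.+)$")


def split_ext_source_name(name: str) -> Tuple[str, str]:
    """Split an ext_sources style name into (digest prefix, upstream file name)."""
    match = EXT_SOURCE_NAME.match(name)
    if match is None:
        raise ValueError(f"not an ext_sources style name: {name!r}")
    return match.group(1), match.group(2)


def matches_digest(data: bytes, expected_md5: str) -> bool:
    """Check downloaded bytes against the expected MD5 hex digest."""
    return hashlib.md5(data).hexdigest() == expected_md5.lower()


def fetch_verified(url: str, expected_md5: str) -> bytes:
    """Download a file from the upstream location and verify it before use."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    if not matches_digest(data, expected_md5):
        raise ValueError(f"digest mismatch for {url}")
    return data
```

Here `url` would point at the upstream project's own release location, such as
a file below the Tomcat archive directory mentioned above.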

Files like the Adobe core AFMs (Adobe-Core35_AFMs-314.tar.gz) don't need to
be external: they are category A [1] and can be in the source tree if that
makes sense.

Regards,
Dave

[1] https://issues.apache.org/jira/browse/LEGAL-35


> 
> Pedro.
> 



Re: ooo large commit breaks mirror sync !!

2012-02-21 Thread Pedro Giffuni

On 02/21/12 19:08, Dave Fisher wrote:

> On Feb 21, 2012, at 12:59 PM, Pedro Giffuni wrote:
>
>> On 02/21/12 12:46, Dave Fisher wrote:
>>> Hi Armin,
>>>
>>>  U   branches/alg/install/ext_sources/48470d662650c3c074e1c3fabbc67bbd-README_source-9.0.0.7-bj.txt
>>>  U   branches/alg/install/ext_sources/3b179ed18f65c43141528aa6d2440db4-serf-1.0.0.tar.bz2
>>>  U   branches/alg/install/ext_sources/2c9b0f83ed5890af02c0df1c1776f39b-commons-httpclient-3.1-src.tar.gz
>>>  U   branches/alg/install/ext_sources/48a9f787f43a09c0a9b7b00cd1fddbbf-hyphen-2.7.1.tar.gz
>>>  U   branches/alg/install/ext_sources/bc702168a2af16869201dbe91e46ae48-LICENSE_Python-2.6.1
>>> ...
>>>  U   branches/alg/install/ext_sources/c441926f3a552ed3e5b274b62e86af16-STLport-4.0.tar.gz
>>>
>>> Is it necessary to have all these external source tarballs in the branch? These
>>> files are kept outside of trunk for a reason - to avoid sledgehammers on
>>> changes.
>>
>> I have always been of the idea of keeping them outside the repository.
>
> They're OK where they are right now. I just think they need to be on a branch.
> The proper action in svn is to tag.


The recent issue in SVN was due to incorrect properties when uploading 
the initial image of the repository: I don't see how tagging could have 
prevented it but thankfully I don't think changing the properties will 
be necessary in the future.



>> I think that sourceforge will provide some extra space in the extensions
>> site where we could keep them... but it's just an idea for later.
>
> To me, dealing with the build dependencies is a post-release and pre-graduation
> issue. SourceForge might be the place if we need a forked version of a tool,
> but I really think that we should be using the dependent project's distribution
> as much as we can. For example Apache Tomcat 5.5.35 will always be available
> from http://archive.apache.org/dist/tomcat/tomcat-5/v5.5.35/
>
> I'm certain that Rob will come up with the case of an orphaned project or one
> where needed alterations are not taken by the upstream. These are the
> SourceForge or new Incubator project cases.


The versions of icc and saxon that we carry are obsolete and are not 
available anywhere but it's just a matter of updating them. I don't 
think we carry orphaned projects proper.




> In Apache POI we pull dependencies in an ant build script from maven.
>
> [Ant build snippet lost in the archive; the only surviving fragment is the
> repository URL http://repo1.maven.org]
>
> I'm just saying that much can be managed outside of the Apache ooo repository.
> The best place is somewhere the dependent project has placed their releases.


Yes, that's essentially what FreeBSD ports and all the subsequent Linux
reinventions (Debian, Gentoo, etc.) do. Subversion is not
a good place for holding them though.


> Files like the Adobe core AFMs (Adobe-Core35_AFMs-314.tar.gz) don't need to
> be external: they are category A [1] and can be in the source tree if that
> makes sense.


For an alternative approach look at the instructions in 
main/stax/download/ .


cheers,

Pedro.


Re: ooo large commit breaks mirror sync !!

2012-04-11 Thread Armin Le Grand

Hi Gavin,

On 21.02.2012 00:21, Gavin McDonald wrote:

> If anyone wants to make a large commit like the one linked below [1] (over 8000
> paths changed and lots of zip and tar files), PLEASE notify infra first and
> schedule it for a WEEKEND.
>
> People have been struggling to commit to the EU mirror since this large
> commit was started 5 hours ago.
>
> Thanks
>
> Gav...
>
> [1] - http://svn.apache.org/viewvc?view=revision&sortby=rev&sortdir=down&revision=1291394


Sorry to come back to this one, but I've re-synced and built the aw080
branch
(https://svn.apache.org/repos/asf/incubator/ooo/branches/alg/aw080),
which contains all the changes that led to the first problems. Worse,
this time even more changes are involved.


I see no other way to resync my branch (containing 1 1/2 years of
work on OOo code) than to commit it to the branch. Please advise
when and how to do that; I'll be ready to commit in the next few days.


Sincerely,
Armin
--
ALG