Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

2011-07-06 Thread Greg Stein
This is cool. I have one basic question: do you want the latest
content, or do you want full history?

If latest content, then you could use svn export rather than svn
checkout. However, export won't pick up changes from upstream. The
script would need to skip existing directories, rather than update
them.
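
A minimal sketch of the export variant, for illustration only. The function name fetch_export and the SVN override variable are hypothetical and not part of fetch-all-web.sh; the sketch assumes any existing path should be left alone, since an export carries no working-copy metadata to update from.

```shell
#!/bin/sh
# Sketch only: SVN and fetch_export are hypothetical names.
SVN=${SVN:-svn}

# Export the latest content of URL into DIR.
# If DIR already exists it is skipped, because an exported tree
# cannot be updated in place.
fetch_export() {
  url=$1
  dir=$2
  if test -e "$dir"; then
    echo "'$dir' exists. Skipping ..."
    return 0
  fi
  echo "'$dir' is being exported ..."
  "$SVN" export "$url" "$dir"
}
```

Inside the existing loop this would be called as fetch_export "$webrepos" "$webproject". One trade-off: a partially exported directory left by an aborted run is skipped too, so it would have to be deleted by hand before rerunning.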

If you want history, then we'll want to use something like svnsync to
copy history into local repositories. (or svnrdump from the upcoming
1.7 release)
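
For the history case, a mirror could be set up roughly like this. It is a sketch under assumptions: mirror_history, SVNADMIN and SVNSYNC are hypothetical names, the mirror path must be absolute, and whether the Kenai server lets svnsync read the repository has not been checked here.

```shell
#!/bin/sh
# Sketch only: SVNADMIN, SVNSYNC and mirror_history are hypothetical names.
SVNADMIN=${SVNADMIN:-svnadmin}
SVNSYNC=${SVNSYNC:-svnsync}

# Mirror the full history of SRC_URL into a local repository at MIRROR
# (an absolute path). The first run creates and initializes the mirror;
# later runs only pull revisions that are new upstream.
mirror_history() {
  src_url=$1
  mirror=$2
  if test ! -d "$mirror"; then
    "$SVNADMIN" create "$mirror" || return 1
    # svnsync keeps its bookkeeping in revision properties, so the
    # mirror needs a pre-revprop-change hook that allows those changes.
    printf '#!/bin/sh\nexit 0\n' > "$mirror/hooks/pre-revprop-change"
    chmod +x "$mirror/hooks/pre-revprop-change"
    "$SVNSYNC" init "file://$mirror" "$src_url" || return 1
  fi
  "$SVNSYNC" sync "file://$mirror"
}
```

Like the checkout script, this can be rerun: an interrupted sync continues from the last revision that was copied.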

Thoughts?

Cheers,
-g

On Sun, Jul 3, 2011 at 21:07, Dave Fisher dave2w...@comcast.net wrote:
 This is a script and text file for fetching and maintaining an svn checkout 
 of many OOo projects' Kenai webcontent.

 I followed the same pattern for the script and text file as Greg did for the 
 CWS Mercurial pulls.

 dave$ ./fetch-all-web.sh web-list.txt ~/Documents/webtest
  './projects' exists. Updating ...
 At revision 3.
  './www' exists. Updating ...
 At revision 53.
  './download' exists. Updating ...
 At revision 296.
  './development' exists. Updating ...
 At revision 15.

 Regards,
 Dave

 On Jul 3, 2011, at 5:48 PM, w...@apache.org wrote:

 Author: wave
 Date: Mon Jul  4 00:48:01 2011
 New Revision: 1142528

 URL: http://svn.apache.org/viewvc?rev=1142528&view=rev
 Log:
 A script for pulling webcontent from Kenai's svn repos plus the start to the 
 web-project list. The script follows the pattern of fetch-all-cws.sh. It is 
 a similar process.

 Added:
    incubator/ooo/trunk/tools/dev/fetch-all-web.sh   (with props)
    incubator/ooo/trunk/tools/dev/web-list.txt   (with props)

 Added: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 URL: 
 http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/fetch-all-web.sh?rev=1142528&view=auto
 ==
 --- incubator/ooo/trunk/tools/dev/fetch-all-web.sh (added)
 +++ incubator/ooo/trunk/tools/dev/fetch-all-web.sh Mon Jul  4 00:48:01 2011
 @@ -0,0 +1,81 @@
 +#!/bin/sh
 +#
 +# Licensed to the Apache Software Foundation (ASF) under one
 +# or more contributor license agreements.  See the NOTICE file
 +# distributed with this work for additional information
 +# regarding copyright ownership.  The ASF licenses this file
 +# to you under the Apache License, Version 2.0 (the
 +# "License"); you may not use this file except in compliance
 +# with the License.  You may obtain a copy of the License at
 +#
 +#   http://www.apache.org/licenses/LICENSE-2.0
 +#
 +# Unless required by applicable law or agreed to in writing,
 +# software distributed under the License is distributed on an
 +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 +# KIND, either express or implied.  See the License for the
 +# specific language governing permissions and limitations
 +# under the License.
 +#
 +
 +#
 +# Use this script to fetch the webcontent of each project
 +# listed in the specified file (typically, web-list.txt).
 +#
 +# See https://cwiki.apache.org/confluence/display/OOOUSERS/OOo-Sitemap
 +# for a note on the checkout from the Kenai svn repository.
 +#
 +# USAGE:
 +#   $ ./fetch-all-web.sh WEB-LIST WORK-DIR
 +#
 +#     WEB-LIST is a file containing the list of Projects to fetch
 +#       (see the file tools/dev/web-list.txt)
 +#     WORK-DIR each project's webcontent will be created in a
 +#       subdirectory of WORK-DIR
 +#
 +#  Future steps will include scripts to transform the content for
 +#  the Apache CMS or a Confluence Wiki import
 +#
 +
 +if test $# != 2; then
 +  echo "USAGE: $0 WEB-LIST WORK-DIR"
 +  exit 1
 +fi
 +
 +REPOS='https://svn.openoffice.org/svn/'
 +REPOS2='~webcontent'
 +
 +# Make the work directory, in case it does not exist
 +if test ! -e "$2"; then
 +  mkdir "$2"
 +fi
 +
 +# Turn the parameters into absolute paths
 +work=`(cd "$2" ; pwd)`
 +
 +webdir=`dirname "$1"`
 +webfile=`basename "$1"`
 +weblist=`(cd "$webdir" ; pwd)`/$webfile
 +
 +
 +for webproject in `grep '^\./' "$weblist"` ; do
 +  cd "$work"
 +
 +  webrepos=${REPOS}${webproject}${REPOS2}
 +
 +  if test -d "$webproject" ; then
 +    echo " '$webproject' exists. Updating ..."
 +    cd "$webproject"
 +    svn update
 +
 +  elif test -e "$webproject" ; then
 +    echo "ERROR: '$webproject' exists and is not a directory."
 +    exit 1
 +
 +  # otherwise this is the first run for this project: check it out
 +  else
 +    echo " '$webproject' is being created ..."
 +    svn co "$webrepos" "$webproject"
 +  fi
 +
 +done

 Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 --
    svn:eol-style = native

 Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 --
    svn:executable = *

 Added: incubator/ooo/trunk/tools/dev/web-list.txt
 URL: 
 http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/web-list.txt?rev=1142528&view=auto
 

Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

2011-07-06 Thread Dave Fisher

On Jul 6, 2011, at 10:42 AM, Greg Stein wrote:

 This is cool. I have one basic question: do you want the latest
 content, or do you want full history?

Well, I think most of the history was erased in the conversion to Kenai: 
nearly everything I see dates from the initial revision in which it was imported. 
There is a clear revision 1 for most of www.

So yes, history will be important, but there is not a whole lot of it.

 
 If latest content, then you could use svn export rather than svn
 checkout. However, export won't pick up changes from upstream. The
 script would need to skip existing directories, rather than update
 them.

I followed the pattern with your hg script, so that an aborted checkout could 
be restarted at any point and be successful. Some of these projects are huge.

For now we are going to work with these and they may change (Kay Schenk has 
commit rights over in OOo/Kenai.)

http://openoffice.org/projects/www/sources/webcontent/show

 
 If you want history, then we'll want to use something like svnsync to
 copy history into local repositories. (or svnrdump from the upcoming
 1.7 release)
 
 Thoughts?

Not sure. I'd like to know what those close to the OOo content think about this.

I'm not sure whether we will do things like this:

(1) Copy the webcontent into the project's svn as data.

(2) Transform it into the format that we want to maintain in the Apache CMS, 
also in svn.

(3) Add the proper extensions and views to the Apache CMS and our local lib to
(a) create html web content.
(b) create odf content.
(c) create pdf content.

or like this:

(1) Copy the webcontent into a scratch area and transform it into a directory 
structure to be imported into svn.

(2) Add the proper extensions and views to the Apache CMS and our local lib to
(a) create html web content.
(b) create odf content.
(c) create pdf content.


I was thinking of the two-step approach, (1)/(2), until we get a better handle 
on the process.

But maybe it would be best to get everything into the archive.

If we do the three-step approach, (1)(2)(3), then we'll need to schedule a 
weekend process with Infra. Correct? When will svnrdump be available, and will 
it work with OOo/Kenai? What does that process look like?

Regards,
Dave

 

Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

2011-07-03 Thread Dave Fisher
This is a script and text file for fetching and maintaining an svn checkout of 
many OOo projects' Kenai webcontent.

I followed the same pattern for the script and text file as Greg did for the 
CWS Mercurial pulls.

dave$ ./fetch-all-web.sh web-list.txt ~/Documents/webtest
 './projects' exists. Updating ...
At revision 3.
 './www' exists. Updating ...
At revision 53.
 './download' exists. Updating ...
At revision 296.
 './development' exists. Updating ...
At revision 15.

Regards,
Dave

On Jul 3, 2011, at 5:48 PM, w...@apache.org wrote:

 Author: wave
 Date: Mon Jul  4 00:48:01 2011
 New Revision: 1142528
 
 URL: http://svn.apache.org/viewvc?rev=1142528&view=rev
 Log:
 A script for pulling webcontent from Kenai's svn repos plus the start to the 
 web-project list. The script follows the pattern of fetch-all-cws.sh. It is a 
 similar process.
 
 Added:
    incubator/ooo/trunk/tools/dev/fetch-all-web.sh   (with props)
    incubator/ooo/trunk/tools/dev/web-list.txt   (with props)
 
 Added: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 URL: 
 http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/fetch-all-web.sh?rev=1142528&view=auto
 ==
 --- incubator/ooo/trunk/tools/dev/fetch-all-web.sh (added)
 +++ incubator/ooo/trunk/tools/dev/fetch-all-web.sh Mon Jul  4 00:48:01 2011
 @@ -0,0 +1,81 @@
 +#!/bin/sh
 +#
 +# Licensed to the Apache Software Foundation (ASF) under one
 +# or more contributor license agreements.  See the NOTICE file
 +# distributed with this work for additional information
 +# regarding copyright ownership.  The ASF licenses this file
 +# to you under the Apache License, Version 2.0 (the
 +# "License"); you may not use this file except in compliance
 +# with the License.  You may obtain a copy of the License at
 +#
 +#   http://www.apache.org/licenses/LICENSE-2.0
 +#
 +# Unless required by applicable law or agreed to in writing,
 +# software distributed under the License is distributed on an
 +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 +# KIND, either express or implied.  See the License for the
 +# specific language governing permissions and limitations
 +# under the License.
 +#
 +
 +#
 +# Use this script to fetch the webcontent of each project
 +# listed in the specified file (typically, web-list.txt).
 +#
 +# See https://cwiki.apache.org/confluence/display/OOOUSERS/OOo-Sitemap
 +# for a note on the checkout from the Kenai svn repository.
 +#
 +# USAGE:
 +#   $ ./fetch-all-web.sh WEB-LIST WORK-DIR
 +#
 +#     WEB-LIST is a file containing the list of Projects to fetch
 +#       (see the file tools/dev/web-list.txt)
 +#     WORK-DIR each project's webcontent will be created in a
 +#       subdirectory of WORK-DIR
 +#
 +#  Future steps will include scripts to transform the content for
 +#  the Apache CMS or a Confluence Wiki import
 +#
 +
 +if test $# != 2; then
 +  echo "USAGE: $0 WEB-LIST WORK-DIR"
 +  exit 1
 +fi
 +
 +REPOS='https://svn.openoffice.org/svn/'
 +REPOS2='~webcontent'
 +
 +# Make the work directory, in case it does not exist
 +if test ! -e "$2"; then
 +  mkdir "$2"
 +fi
 +
 +# Turn the parameters into absolute paths
 +work=`(cd "$2" ; pwd)`
 +
 +webdir=`dirname "$1"`
 +webfile=`basename "$1"`
 +weblist=`(cd "$webdir" ; pwd)`/$webfile
 +
 +
 +for webproject in `grep '^\./' "$weblist"` ; do
 +  cd "$work"
 +
 +  webrepos=${REPOS}${webproject}${REPOS2}
 +
 +  if test -d "$webproject" ; then
 +    echo " '$webproject' exists. Updating ..."
 +    cd "$webproject"
 +    svn update
 +
 +  elif test -e "$webproject" ; then
 +    echo "ERROR: '$webproject' exists and is not a directory."
 +    exit 1
 +
 +  # otherwise this is the first run for this project: check it out
 +  else
 +    echo " '$webproject' is being created ..."
 +    svn co "$webrepos" "$webproject"
 +  fi
 +
 +done
 
 Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 --
    svn:eol-style = native
 
 Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
 --
    svn:executable = *
 
 Added: incubator/ooo/trunk/tools/dev/web-list.txt
 URL: 
 http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/web-list.txt?rev=1142528&view=auto
 ==
 --- incubator/ooo/trunk/tools/dev/web-list.txt (added)
 +++ incubator/ooo/trunk/tools/dev/web-list.txt Mon Jul  4 00:48:01 2011
 @@ -0,0 +1,35 @@
 +#
 +# Licensed to the Apache Software Foundation (ASF) under one
 +# or more contributor license agreements.  See the NOTICE file
 +# distributed with this work for additional information
 +# regarding copyright ownership.  The ASF licenses this file
 +# to you under the Apache License, Version 2.0 (the
 +# "License"); you may not use this file