Re: [Spacewalk-devel] [PATCH] Filters on reposync

2011-10-19 Thread Baptiste AGASSE
Hi Michael,

Thanks a lot for your answer.

Regards.

Baptiste

- Original message -
From: Michael Mraka michael.mr...@redhat.com
To: spacewalk-devel@redhat.com
Sent: Tuesday, 18 October 2011 13:51:16
Subject: Re: [Spacewalk-devel] [PATCH] Filters on reposync

Baptiste AGASSE wrote:
% Hi,
% 
% I've just looked at the reposync.py code and I don't see any part related to
% getting include/exclude filters from the database as properties of
% repositories, which Jan asked me for when I submitted some code for
% filters (he said that he didn't like that filters are specified at
% runtime).
% 
% Any news on it? If you decide to store filters in the database, will
% they be accessible through the repositories web UI?

Hi Baptiste,

yesterday and today I finally found some time to look into it again.  I
added a database table to hold include/exclude filters and
spacewalk-repo-sync code which is able to use it. Unfortunately there's
no web UI part yet, so if you want to try it you have to insert filter
data directly into the database.  E.g.

  insert into rhncontentsourcefilter (id, source_id, sort_order, flag, filter)
values (sequence_nextval('rhn_csf_id_seq') , 536, 1, '+', 'samba-client');

where source_id is repository id (that's the end part as in
https://spacewalk/rhn/channels/manage/repos/RepoEdit.do?id=536 url),
sort_order specifies the order of filters which belong to the same repo,
flag is either '+' or '-' (include / exclude) and filter is the list of
packages to be filtered.


Regards.


--
Michael Mráka
Satellite Engineering, Red Hat

___
Spacewalk-devel mailing list
Spacewalk-devel@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-devel


Re: [Spacewalk-devel] [PATCH] Filters on reposync

2011-10-18 Thread Michael Mraka
Baptiste AGASSE wrote:
% Hi,
% 
% I've just looked at the reposync.py code and I don't see any part related to
% getting include/exclude filters from the database as properties of
% repositories, which Jan asked me for when I submitted some code for
% filters (he said that he didn't like that filters are specified at
% runtime).
% 
% Any news on it? If you decide to store filters in the database, will
% they be accessible through the repositories web UI?

Hi Baptiste,

yesterday and today I finally found some time to look into it again.  I
added a database table to hold include/exclude filters and
spacewalk-repo-sync code which is able to use it. Unfortunately there's
no web UI part yet, so if you want to try it you have to insert filter
data directly into the database.  E.g.

  insert into rhncontentsourcefilter (id, source_id, sort_order, flag, filter)
values (sequence_nextval('rhn_csf_id_seq') , 536, 1, '+', 'samba-client');

where source_id is repository id (that's the end part as in
https://spacewalk/rhn/channels/manage/repos/RepoEdit.do?id=536 url),
sort_order specifies the order of filters which belong to the same repo,
flag is either '+' or '-' (include / exclude) and filter is the list of
packages to be filtered.
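
To make the column meanings concrete, here is a minimal sketch of how a sync
tool might read such rows back in sort_order. It uses an in-memory SQLite
table as a stand-in for the real Spacewalk schema. The helper name
load_filters and the assumption that the filter column holds a
comma-separated list of package names are illustrative only; they are not
taken from the spacewalk-repo-sync code.

```python
import sqlite3

def load_filters(conn, source_id):
    """Return [(flag, [patterns...]), ...] for one repo, ordered by sort_order."""
    rows = conn.execute(
        'SELECT flag, "filter" FROM rhncontentsourcefilter '
        'WHERE source_id = ? ORDER BY sort_order',
        (source_id,),
    ).fetchall()
    # Assumption: several package patterns are separated by commas.
    return [(flag, [p.strip() for p in text.split(",") if p.strip()])
            for flag, text in rows]

# Stand-in table (SQLite, not the real Spacewalk database schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE rhncontentsourcefilter '
    '(id INTEGER PRIMARY KEY, source_id INTEGER, sort_order INTEGER, '
    'flag TEXT, "filter" TEXT)'
)
conn.executemany(
    'INSERT INTO rhncontentsourcefilter (source_id, sort_order, flag, "filter") '
    'VALUES (?, ?, ?, ?)',
    [(536, 2, "-", "samba-client-debuginfo"),      # hypothetical exclude row
     (536, 1, "+", "samba-client, samba-common")], # hypothetical include row
)
filters = load_filters(conn, 536)
print(filters)
```

The rows are inserted out of order on purpose: the ORDER BY clause, not the
insertion order, decides which filter is applied first.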


Regards.


--
Michael Mráka
Satellite Engineering, Red Hat


Re: [Spacewalk-devel] [PATCH] Filters on reposync

2011-09-13 Thread Baptiste AGASSE
Hi Michael,

> my original idea was to be able to stack includes / excludes on top of
> each other, e.g.
>   include=openoffice.org-*
>   exclude=openoffice.org-draw,openoffice.org-langpack*
>   include=openoffice.org-langpack-en,openoffice.org-langpack-de
> which in the end will sync openoffice.org packages except *-draw and
> *-langpacks but will also include *langpack-en and *langpack-de.
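
The stacking idea quoted above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the function name
apply_stacked_filters and the sample package names are made up, shell-style
matching is done with the standard fnmatch module, and the choice of starting
set (empty when the first step is an include, everything otherwise) is one
possible convention that the thread does not spell out.

```python
from fnmatch import fnmatchcase

def apply_stacked_filters(names, filters):
    """Apply ('+'|'-', [patterns]) steps in order; later steps override earlier ones."""
    # Assumption: if the first step is an include, start from nothing;
    # otherwise start from the full package list.
    selected = set() if filters and filters[0][0] == "+" else set(names)
    for flag, patterns in filters:
        matched = {n for n in names
                   if any(fnmatchcase(n, p) for p in patterns)}
        if flag == "+":
            selected |= matched
        else:
            selected -= matched
    # Preserve the original ordering of the package list.
    return [n for n in names if n in selected]

# Hypothetical package names, for illustration only.
names = [
    "openoffice.org-core",
    "openoffice.org-draw",
    "openoffice.org-langpack-en",
    "openoffice.org-langpack-de",
    "openoffice.org-langpack-fr",
    "samba-client",
]
filters = [
    ("+", ["openoffice.org-*"]),
    ("-", ["openoffice.org-draw", "openoffice.org-langpack*"]),
    ("+", ["openoffice.org-langpack-en", "openoffice.org-langpack-de"]),
]
result = apply_stacked_filters(names, filters)
print(result)
```

Because each step only adds to or removes from the running selection, the
order of the steps matters, which is exactly what a single include list plus a
single exclude list cannot express.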

OK, I know that my solution doesn't take the include/exclude order into account
and can't do this, but I can take a look at it.

> The filtering and dependency resolver parts look good. I'm not keen on
> inheriting ContentSource from yum.YumBase because it leads to an implicit
> import of all yum repos defined in the server's /etc/yum.repos.d/, which have
> nothing to do with repos defined in Spacewalk (the application). Moreover,
> you have to delete them so they don't interfere with the valid Spacewalk
> repos.

Yes, I haven't found many resources on it; I did the best I could to use the yum
dependency resolver.

> % - All versions of the packages excluded by a filter are now deleted
> % from DB and filesystem
>
> Well, this changes the current behavior of spacewalk-repo-sync because we
> never remove packages which are already in the channel during the sync
> process. E.g. you might have packages pushed manually (via rhnpush) into
> the channel; should they also be removed? I don't think so.
> I decided to skip this part of the patch for now.

Yes, in the patch that I've submitted, include/exclude is in repository scope: 
only excluded packages from current repository are deleted

Regards.



Re: [Spacewalk-devel] [PATCH] Filters on reposync

2011-08-23 Thread Baptiste AGASSE
Hi Jan,

Following your advice, I modified my code.
It now gets the include and exclude filters from the database.

I ran the following SQL query on my Oracle XE database:
ALTER TABLE rhnContentSource ADD (include_filter VARCHAR(255), exclude_filter VARCHAR(255));

Regards.

Baptiste

- Original message -
From: Baptiste AGASSE baptiste.aga...@lyra-network.com
To: spacewalk-devel@redhat.com
Sent: Thursday, 18 August 2011 22:50:34
Subject: Re: [Spacewalk-devel] [PATCH] Filters on reposync

Hi Jan,
OK, I can take a look at this next week.

Regards.

Baptiste

- Original message -
From: Jan Pazdziora jpazdzi...@redhat.com
To: spacewalk-devel@redhat.com
Sent: Thursday, 18 August 2011 13:39:57
Subject: Re: [Spacewalk-devel] [PATCH] Filters on reposync

On Wed, Aug 17, 2011 at 08:23:36PM +0200, Baptiste AGASSE wrote:
 
> Following your advice I have modified my code:
> - You can now include and/or exclude packages (with the --include and/or
> --exclude options)
> - The include filter takes priority over the exclude filter: if a package meets
> both the 'include' and 'exclude' rules, it will be included
>   e.g.:
> exclude = [ 'openoffice.org-langpack-*', ...]
> include = [ 'openoffice.org-langpack-en-*', ...]
>
> - Package filtering is in yum_src.py
> - The yum dependency resolver is now used to find the dependencies of selected packages
> - All versions of the packages excluded by a filter are now deleted from the DB
> and filesystem
> - The elapsed time is printed at the end of the sync
>
> Any comments are welcome.

I don't really like the fact that the exclude/include options are
specified on the spacewalk-repo-sync runtime, rather than being
properties of the repository. In other words -- these exclude/include
lists should be specified in the database, so that they would be used
any time spacewalk-repo-sync is run, no matter if it is run from the
command line or via scheduled event by taskomatic.

Now that the core functionality is in place, could you amend it some
more and have the lists stored in some database table and used from
there?

-- 
Jan Pazdziora
Principal Software Engineer, Satellite Engineering, Red Hat

diff --git a/backend/satellite_tools/repo_plugins/yum_src.py b/backend/satellite_tools/repo_plugins/yum_src.py
index bfc6161..9871a32 100644
--- a/backend/satellite_tools/repo_plugins/yum_src.py
+++ b/backend/satellite_tools/repo_plugins/yum_src.py
@@ -74,14 +74,15 @@ class YumUpdateMetadata(UpdateMetadata):
 no = self._no_cache.setdefault(file['name'], set())
 no.add(un)
 
-class ContentSource:
+class ContentSource(yum.YumBase):
     url = None
     name = None
-    repo = None
     cache_dir = '/var/cache/rhn/reposync/'
-    def __init__(self, url, name):
-        self.url = url
-        self.name = name
+    repo_id = None
+    filters = {'include': [], 'exclude': []}
+
+    def __init__(self, url, name, filters = None ):
+        yum.YumBase.__init__(self)
         self._clean_cache(self.cache_dir + name)
 
         # read the proxy configuration in /etc/rhn/rhn.conf
@@ -97,14 +98,14 @@ class ContentSource:
         else:
             self.proxy_url = None
 
-    def list_packages(self):
-        """ list packages"""
-        repo = yum.yumRepo.YumRepository(self.name)
-        self.repo = repo
+        if filters:
+            self.filters = filters
+
+        repo = yum.yumRepo.YumRepository(name)
         repo.cache = 0
         repo.metadata_expire = 0
-        repo.mirrorlist = self.url
-        repo.baseurl = [self.url]
+        repo.mirrorlist = url
+        repo.baseurl = [url]
         repo.basecachedir = self.cache_dir
         if self.proxy_url is not None:
             repo.proxy = self.proxy_url
@@ -113,13 +114,36 @@ class ContentSource:
         warnings.disable()
         repo.baseurlSetup()
         warnings.restore()
-
         repo.setup(False)
-        sack = repo.getPackageSack()
-        sack.populate(repo, 'metadata', None, 0)
-        list = sack.returnPackages()
-        to_return = []
-        for pack in list:
+        repos = self.repos.findRepos('*')
+        for rep in repos:
+            self.repos.disableRepo(rep.id)
+            self.repos.delete(rep.id)
+
+        self.repo_id = repo.id
+        self.repos.add(repo)
+        self.pkgSack = self.repos.getRepo(self.repo_id).getPackageSack()
+        self.pkgSack.populate(self.repos.getRepo(self.repo_id), 'metadata', None, 0)
+
+    def getName(self):
+        return self.repos.getRepo(self.repo_id).id
+
+    def getUrl(self):
+        return self.repos.getRepo(self.repo_id).mirrorlist
+
+    def getFilters(self):
+        return self.filters
+
+    def getRepo(self):
+        return self.repos.getRepo(self.repo_id)
+
+    def list_packages(self):
+        """ list packages"""
+        return self._list_packages(self.pkgSack.returnPackages())
+
+    def

Re: [Spacewalk-devel] [PATCH] Filters on reposync

2011-08-17 Thread Baptiste AGASSE
Hi Michael,

Following your advice I have modified my code:
- You can now include and/or exclude packages (with the --include and/or
--exclude options)
- The include filter takes priority over the exclude filter: if a package meets
both the 'include' and 'exclude' rules, it will be included
  e.g.:
exclude = [ 'openoffice.org-langpack-*', ...]
include = [ 'openoffice.org-langpack-en-*', ...]

- Package filtering is in yum_src.py
- The yum dependency resolver is now used to find the dependencies of selected packages
- All versions of the packages excluded by a filter are now deleted from the DB and
filesystem
- The elapsed time is printed at the end of the sync
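
The "include takes priority" rule from the list above could look roughly like
the sketch below. It is not the code from the patch: the helper names
matches_any and keep_package are invented for illustration, and the thread
does not say what happens to a package that matches neither list, so this
sketch simply keeps it.

```python
from fnmatch import fnmatchcase

def matches_any(name, patterns):
    """True if name matches at least one shell-style pattern."""
    return any(fnmatchcase(name, p) for p in patterns)

def keep_package(name, include, exclude):
    """Include wins: a name matching both lists is kept."""
    if matches_any(name, include):
        return True
    # Assumption: a package matching neither list is kept.
    return not matches_any(name, exclude)

exclude = ['openoffice.org-langpack-*']
include = ['openoffice.org-langpack-en-*']

# Hypothetical package names, for illustration only.
for name in ['openoffice.org-langpack-en-GB',
             'openoffice.org-langpack-fr',
             'openoffice.org-writer']:
    print(name, keep_package(name, include, exclude))
```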

Any comments are welcome.

Regards.

Baptiste


- Original message -
From: Michael Mraka michael.mr...@redhat.com
To: spacewalk-devel@redhat.com
Sent: Monday, 8 August 2011 11:45:41
Subject: Re: [Spacewalk-devel] [PATCH] Filters on reposync

Baptiste AGASSE wrote:
% Hi all,
% 
% I've modified /backend/satellite_tools/reposync.py to add filter
% support with dependency solving (like the --rpm-list option of
% cobbler) to the spacewalk reposync command.

Hi Baptiste,

this is a good feature, thanks for sharing it.

% It adds the --filters option to reposync, e.g.: --filters osad rhncfg-*
% foo* bar

So --filters 'xyz*' means to include xyz* packages only. It would be
great to also have the opposite option to exclude some packages, and to let
the two stack on each other. E.g.

  spacewalk-repo-sync --exclude=openoffice.org-langpack-* \
  --include=openoffice.org-langpack-en-* ...

% It allows you to download only selected packages from a repository in order
% to save disk space, mainly if you use only a few packages from it, and it
% deletes packages already present on your system (in the spacewalk server
% and on the filesystem) that don't match any filter (only packages with
% the same NVREA).
% 
% TODO: - Search the spacewalk server for older versions of previously
% downloaded packages that don't match the filters and remove them (I will
% work on it).  - Make it available from the web UI in the repositories
% management part.
% 
% Maybe someone can work on the web UI and the database schema to make
% this option available directly from the web UI? (I'm not familiar with
% Java and Oracle DB.)
% 
% Any comments are welcome; it's the first time that I'm programming in
% Python :).

% diff --git a/backend/satellite_tools/reposync.py b/backend/satellite_tools/reposync.py
% index 6834adb..0428bb9 100644
% --- a/backend/satellite_tools/reposync.py
% +++ b/backend/satellite_tools/reposync.py
...
% @@ -132,6 +137,7 @@ class RepoSync:
%          self.parser.add_option('-t', '--type', action='store', dest='type', help='The type of repo, currently only yum is supported', default='yum')
%          self.parser.add_option('-f', '--fail', action='store_true', dest='fail', default=False , help="If a package import fails, fail the entire operation")
%          self.parser.add_option('-q', '--quiet', action='store_true', dest='quiet', default=False, help="Print no output, still logs output")
% +        self.parser.add_option('-p', '--filters', action='store_true', dest='filters', help="Synchronize only the packets that meet the filter and their dependencies")

action='store_true' means it's a True/False option, but you likely want it to
return a string (a list of patterns).
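
As a minimal sketch of that point (not the change that was eventually
committed), the option can be declared with action='store' so that optparse
keeps its argument as a string, which can then be split into patterns. The
help text and the default value here are placeholders.

```python
from optparse import OptionParser

parser = OptionParser()
# action='store' keeps the option's argument as a string instead of a boolean.
parser.add_option('-p', '--filters', action='store', dest='filters',
                  default='', help='Space-separated package name patterns')

# Passing an explicit argument list keeps the example independent of sys.argv.
options, args = parser.parse_args(['--filters', 'osad rhncfg-* foo* bar'])
patterns = options.filters.split()
print(patterns)
```

With action='store_true', options.filters would be a boolean and the later
options.filters.split() call in the patch would fail. Note that on a real
command line the pattern list has to be quoted so that it reaches the option
as a single argument.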

%          return self.parser.parse_args()
%  
%      def load_plugin(self):
% @@ -308,9 +314,43 @@ class RepoSync:
%              self.regen = True
%  
...
% +    def filter_packages(self, filters, packages):
% +        # Returns 3 lists : selected packages, dependencies, and others
% +        selected = []
% +        dependencies = []
% +        others = []
% +        for pack in packages:
% +            # Select all packages that match one filter
% +            match = False
% +            for filter_str in filters:
% +                reg = re.compile("^" + filter_str.replace("*",".*") + "$")
% +                if reg.match(pack.name):

Wouldn't fnmatch.filter() be better/easier to use here?
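
For illustration, a version of the matching loop built on the standard
fnmatch module might look like this. The function name select_by_filters and
the sample names are made up; the sketch works on plain name strings rather
than on the package objects used in the patch.

```python
import fnmatch

def select_by_filters(names, filters):
    """Return the names matching at least one shell-style pattern, in order."""
    matched = set()
    for pattern in filters:
        # fnmatch.filter returns the subset of names matching one pattern.
        matched.update(fnmatch.filter(names, pattern))
    return [n for n in names if n in matched]

# Hypothetical package names, for illustration only.
names = ['osad', 'rhncfg-client', 'rhncfg-actions', 'foobar', 'bar', 'barista']
picked = select_by_filters(names, ['osad', 'rhncfg-*', 'foo*', 'bar'])
print(picked)
```

Besides being shorter, fnmatch treats "." in a pattern literally, whereas the
hand-built regular expression above leaves it unescaped, so a pattern such as
openoffice.org-* would also match names with any character in place of the
dot.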

% +                    match = True
% +                    break
% +            if match:
...
% +
% +    def package_deps(self, package, packages_list):
...

This is yum repo plugin specific code and it would be better to
implement it in the plugin itself, i.e. repo_plugins/yum_src.py in this case.
Moreover it would be better to call yum's internal depsolver and not
reinvent it again.


Regards,

--
Michael Mráka
Satellite Engineering, Red Hat

diff --git a/backend/satellite_tools/repo_plugins/yum_src.py b/backend/satellite_tools/repo_plugins/yum_src.py
index bfc6161..db6ee3b 100644
--- a/backend/satellite_tools/repo_plugins/yum_src.py
+++ b/backend/satellite_tools/repo_plugins/yum_src.py
@@ -74,14 +74,14 @@ class YumUpdateMetadata(UpdateMetadata):
 no = self._no_cache.setdefault(file['name'], set

[Spacewalk-devel] [PATCH] Filters on reposync

2011-08-05 Thread Baptiste AGASSE
Hi all,

I've modified /backend/satellite_tools/reposync.py to add filter support
with dependency solving (like the --rpm-list option of cobbler) to the
spacewalk reposync command.

It adds the --filters option to reposync,
e.g.: --filters osad rhncfg-* foo* bar

It allows you to download only selected packages from a repository in order to save
disk space, mainly if you use only a few packages from it, and it deletes packages
already present on your system (in the spacewalk server and on the filesystem) that
don't match any filter (only packages with the same NVREA).
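
The overall idea (select by filter, pull in dependencies, treat the rest as
excluded) can be sketched independently of yum. Everything in this example is
hypothetical: the function name partition_packages, the toy dependency map,
and the simplification that a dependency is just another package name. Real
RPM requires/provides resolution is considerably richer.

```python
from fnmatch import fnmatchcase

def partition_packages(filters, requires):
    """Split package names into (selected, dependencies, others).

    `requires` maps each package name to the names it depends on.
    """
    selected = [n for n in requires
                if any(fnmatchcase(n, f) for f in filters)]
    seen = set(selected)
    dependencies = []
    stack = list(selected)
    while stack:
        for dep in requires.get(stack.pop(), []):
            # Only follow dependencies that this repository actually provides.
            if dep in requires and dep not in seen:
                seen.add(dep)
                dependencies.append(dep)
                stack.append(dep)
    others = [n for n in requires if n not in seen]
    return selected, dependencies, others

# Made-up dependency map, for illustration only.
requires = {
    'osad': ['jabberpy', 'rhnlib'],
    'jabberpy': [],
    'rhnlib': ['pyOpenSSL'],
    'pyOpenSSL': [],
    'rhncfg-client': ['rhnlib'],
    'unrelated-tool': [],
}
selected, dependencies, others = partition_packages(['osad'], requires)
print(selected, sorted(dependencies), others)
```

Only the packages in the third list would be candidates for skipping or
removal; dependencies of selected packages are synced even though they match
no filter themselves.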

TODO:
 - Search the spacewalk server for older versions of previously downloaded packages
that don't match the filters and remove them (I will work on it).
 - Make it available from the web UI in the repositories management part.

Maybe someone can work on the web UI and the database schema to make this
option available directly from the web UI? (I'm not familiar with Java and Oracle
DB.)

Any comments are welcome; it's the first time that I'm programming in Python :).

Regards.

Baptiste.

diff --git a/backend/satellite_tools/reposync.py b/backend/satellite_tools/reposync.py
index 6834adb..0428bb9 100644
--- a/backend/satellite_tools/reposync.py
+++ b/backend/satellite_tools/reposync.py
@@ -15,6 +15,7 @@
 #
 import sys, os, time
 import hashlib
+import re
 from datetime import datetime
 import traceback
 from optparse import OptionParser
@@ -44,6 +45,7 @@ class RepoSync:
     fail = False
     quiet = False
     regen = False
+    filters = []
 
     def main(self):
         initCFG('server')
@@ -89,6 +91,9 @@ class RepoSync:
             quit = True
             self.error_msg("--channel must be specified")
 
+        if options.filters:
+            self.filters = options.filters.split()
+
         self.log_msg("\nSync started: %s" % (time.asctime(time.localtime())))
         self.log_msg(str(sys.argv))
 
@@ -132,6 +137,7 @@ class RepoSync:
         self.parser.add_option('-t', '--type', action='store', dest='type', help='The type of repo, currently only yum is supported', default='yum')
         self.parser.add_option('-f', '--fail', action='store_true', dest='fail', default=False , help="If a package import fails, fail the entire operation")
         self.parser.add_option('-q', '--quiet', action='store_true', dest='quiet', default=False, help="Print no output, still logs output")
+        self.parser.add_option('-p', '--filters', action='store_true', dest='filters', help="Synchronize only the packets that meet the filter and their dependencies")
         return self.parser.parse_args()
 
     def load_plugin(self):
@@ -308,9 +314,43 @@ class RepoSync:
             self.regen = True
 
     def import_packages(self, plug, url):
-        packages = plug.list_packages()
+        repo_packages = plug.list_packages()
         to_process = []
-        self.print_msg("Repo " + url + " has " + str(len(packages)) + " packages.")
+        to_delete = []
+        self.print_msg("Repo " + url + " has " + str(len(repo_packages)) + " packages.")
+
+        if len(self.filters) > 0:
+            selected, dependencies, others = self.filter_packages(self.filters, repo_packages)
+            packages = selected
+            packages.extend(dependencies)
+            self.print_msg("Repo " + url + " has " + str(len(selected)) + " packages selected by filters.")
+            self.print_msg("Repo " + url + " has " + str(len(dependencies)) + " packages dependencies.")
+            self.print_msg("Repo " + url + " has " + str(len(others)) + " packages excluded.")
+
+            for pack in others:
+                # TODO: search for older versions of package and drop them too
+                db_pack = rhnPackage.get_info_for_package(
+                       [pack.name, pack.version, pack.release, pack.epoch, pack.arch],
+                       self.channel_label)
+
+                to_remove = False
+                to_unlink = False
+                if db_pack['path']:
+                    pack.path = os.path.join(CFG.MOUNT_POINT, db_pack['path'])
+                    if self.match_package_checksum(pack.path,
+                                pack.checksum_type, pack.checksum):
+                        # package is already on disk
+                        to_remove = True
+                        if db_pack['channel_label'] == self.channel_label or db_pack['channel_label'] == self.channel_label:
+                            # package is already in the channel
+                            to_unlink = True
+
+                if to_remove or to_unlink:
+                    to_delete.append((pack, to_remove, to_unlink))
+        else:
+            self.print_msg("Repo " + url + " has no filters set.")
+            packages = repo_packages
+
         for pack in packages:
             db_pack = rhnPackage.get_info_for_package(
                    [pack.name, pack.version, pack.release, pack.epoch, pack.arch],
@@ -327,15 +367,16 @@ class RepoSync:
                     if db_pack['channel_label'] == self.channel_label: