Re: TypeError when calling oaiharvest from CLI
Dear Werner, First of all, welcome to Invenio and apologies for the late response. Indeed you have identified a problem in the legacy oaiharvest tool. I have provided a fix here: https://github.com/inveniosoftware/invenio/pull/3428 If you are familiar with git you can try applying this patch using git cherry-pick inside your virtualenv sources. Note that this oaiharvest tool is currently being migrated to a separate package called Invenio-OAIHarvester ( https://github.com/inveniosoftware/invenio-oaiharvester) which will be available soon. This will contain a new command line tool and web interface. Regarding the Internal Server Error you experienced, which URL were you trying to access? Did you try setting up an OAI-PMH source in http://localhost:4000/admin/oaiharvest/oaiharvestadmin.py? Cheers, Jan --- Jan Age Lavik System Developer INSPIRE-HEP http://inspirehep.net Github: @jalavik https://github.com/jalavik Work phone: +41 22 76 78682 On Wed, Aug 5, 2015 at 11:01 AM, Werner Greßhoff werner.gressh...@uni-muenster.de wrote: Hello, first I want to say that I'm new to Invenio and Python, so it might be my error or a misunderstanding! We've installed Invenio version 2.1 and the installation was successful! Now we are trying to harvest some metadata from our existing repository, besides the metadata delivered with the demo site. 
The call from the web frontend led to an Internal Server Error, so I tried the CLI instead with the following command:

    oaiharvest -vListRecords -pmarcxml -f2004-04-01 -u2004-05-31 -o/tmp/marc.xml http://repositorium-dev.uni-muenster.de/oai/miami

Leading to the following message:

    Traceback (most recent call last):
      File "./oaiharvest", line 9, in <module>
        load_entry_point('invenio==2.1.1.dev20150616', 'console_scripts', 'oaiharvest')()
      File "/home/system/.virtualenvs/invenio/src/invenio/invenio/base/helpers.py", line 50, in decorated_func
        result = f(*args, **kwargs)
      File "/home/system/.virtualenvs/invenio/src/invenio/invenio/legacy/oaiharvest/scripts/oaiharvest.py", line 51, in main
        return oai_main()
      File "/home/system/.virtualenvs/invenio/src/invenio/invenio/base/helpers.py", line 52, in decorated_func
        result = f(*args, **kwargs)
      File "/home/system/.virtualenvs/invenio/src/invenio/invenio/legacy/oaiharvest/daemon.py", line 398, in main
        dummy2, dummy3) = urllib.parse(base_url)
    TypeError: 'Module_six_moves_urllib_parse' object is not callable

As a beginner in Python I've no idea what is going wrong. I started with looking at the output of sys.path, but that's directing to the invenio library paths, so I guess Python is importing the correct library. Does someone have a clue where I'm going wrong?! -- Mit freundlichen Grüßen Werner Greßhoff Dezernat 2 - Digitale Dienste Universitäts- und Landesbibliothek Münster
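For readers hitting the same error: the traceback shows the module `urllib.parse` being called as if it were a function. Below is a minimal sketch of that failure and of the call that was probably intended. It uses the Python 3 standard library rather than `six.moves`, and the variable names are illustrative; it is not necessarily how the linked pull request fixes the problem.

```python
import urllib.parse

base_url = "http://repositorium-dev.uni-muenster.de/oai/miami"

# Calling the module itself fails in the same way as in the traceback.
try:
    urllib.parse(base_url)
except TypeError as exc:
    error_message = str(exc)  # the message says the module is not callable

# urlparse() is the function that splits a URL into its six components.
scheme, netloc, path, params, query, fragment = urllib.parse.urlparse(base_url)
print(scheme, netloc, path)
```

With `six`, the equivalent would presumably be `urllib.parse.urlparse` from `six.moves`, but that is an assumption about the legacy code and has not been checked against it.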
Re: RFC on RFC (and wiki)
Hello! I guess this will be one of the final RFCs per e-mail only. I accept that :-) On Mon, May 19, 2014 at 4:00 PM, Tibor Simko tibor.si...@cern.ch wrote: Here are possible actions on them: Ad (1), move these to a new attractive Twitter Bootstrap powered web site for the project, like https://atom.io/ +1 This is simply cool. (Speaking of atom I do have some invites available I believe, although I cannot say if it's any good as I do not own a Mac. Hit me up if you are interested.) Ad (2), these are collaboratively edited guides, aka real wiki pages. Move to GitHub Wiki? GitHub pages? Other CMS? GitHub pages maybe, as part of the (1) site, or just GitHub wiki works too. Ad (3), these could come either built-in with the module sources (in reST), or else they may have evolved from earlier RFC, see next point. Seems like we are moving towards more in-source docs with reST and Sphinx. I guess more verbose pages could be their own .rst files under docs/, similar to how the few HowTos created on pu are. See https://github.com/jirikuncar/invenio/tree/pu/docs/developers Ad (4), we currently have some RFC-like discussions happening: (4b) via GitHub issues, for example: https://github.com/jirikuncar/invenio/issues/189 (4c) via Forum discussions, for example: https://forum.invenio-software.org/t/is-pep257-good-for-you/39/last I am for either GitHub issues or forum. Not sure what is best. GitHub is somewhat closer to code and makes for easier referencing of issues and/or code, although the forum may be better for discussions. Hmm, I think my money goes towards GitHub issues. Ad (5) and (6), these could be auto-generated on the new user-facing site (see (1) above) from GitHub. Yay! This would be nice. Cheers, Jan
Re: Invenio is moving to GitHub
Great job! This is all looking promising. Regarding (2b), will this effort also include adding pull requests for those tickets which are in_review/in_integration? I.e., could we already start adding them as mentioned in (4)? Cheers, Jan --- Jan Age Lavik System Developer INSPIRE-HEP http://inspirehep.net Github: @jalavik https://github.com/jalavik Work phone: +41 22 76 78682 On Thu, May 1, 2014 at 7:18 AM, Tibor Simko tibor.si...@cern.ch wrote: On Wed, 30 Apr 2014, Tibor Simko wrote: We plan to start the final migration sometime after lunch. Done. With Jiri we migrated all the tickets; the process has now finished. Special thanks to Yoan who helped out with the migration script. (1) There were a couple of quirks along the way, so about a dozen tickets will get an update later today. (So please don't revoke your GitHub tokens yet.) (2) You can check the current status here: https://github.com/inveniosoftware/invenio/issues (2a) Cosmetic changes are coming, e.g. colouring of labels according to their respective namespaces. (2b) Then we'll have to do triaging of existing tickets in order to update their status. This will be part of the wider triaging team effort, as we discussed it during past Mondays. (2c) One notable thing that was not migrated is the keywords (because GitHub does not have a notion of free text labels). We may want to introduce some rare keywords simply as free text in the issue comments. As for more popular ones, we may want to introduce a new label namespace, say s_ for service. Example: s_INSPIRE, s_ILO. To be discussed. (3) As of today, please submit any new ticket directly on GitHub, for all the branches: https://github.com/inveniosoftware/invenio/issues/new (Note that I revoked all Trac rights from everyone anyway.) (4) As of today, please issue new pull requests directly on GitHub, for all the branches (maint/master/next; pu still in Jiri's personal space). (5) Travis-CI builds are activated for all branches starting from maint-1.0. 
So you'll see green/red light when asking for merge. I'll update our Git Workflow document accordingly with all the details when time permits: http://invenio-software.org/wiki/Tools/Git/Workflow Best regards -- Tibor Simko
Re: [PU Branch] Documentation with Sphinx
(Sorry for double posting - sent from wrong e-mail) Right now the docs seem to just be listed all on one page (see for example api.rst) without a detailed index or TOC. This happens when we add the automodule syntax in the api.rst or modules.rst files. This will get messy as soon as all the other modules are added here. Will module-level documentation be automatically collected without being listed in these files? If so, where would we find it? One of the things that would be nice to have is a page per module, for example with everything you need to know about the tags module. This page may then even have another hierarchy of docs below detailing high- to low-level APIs, admin/CLI manuals etc. Overall, though, this new documentation scheme looks promising and we are ready to help shape it further :-) Cheers, Jan --- Jan Age Lavik System Developer INSPIRE-HEP http://inspirehep.net Github: @jalavik https://github.com/jalavik Work phone: +41 22 76 78682 On Thu, Mar 27, 2014 at 9:25 AM, Tibor Simko tibor.si...@cern.ch wrote: On Wed, 26 Mar 2014, Graham R. 
Armstrong wrote: Having gotten the API for Matcher in place, I'm wondering how we're planning to collect together documentation for PU. For APIs, the rough plan was to document methods inside restful files using rst docstrings, and then the overall machinery would collect and build the complete API docs out of these per-module inline docs. Best regards -- Tibor Simko
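To make the "rst docstrings" plan concrete, here is a minimal sketch of a function documented inline with reST field lists. The function `match_records` and its behaviour are hypothetical and are not the actual Matcher API; the point is only that the docstring is the single source a documentation builder would collect.

```python
import inspect


def match_records(record, candidates):
    """Return the candidates that share a title with ``record``.

    :param record: dictionary describing the record to match
    :param candidates: iterable of record dictionaries to compare against
    :returns: list of matching candidates
    """
    return [c for c in candidates if c.get("title") == record.get("title")]


# Documentation tools such as Sphinx autodoc read this same docstring.
print(inspect.getdoc(match_records))
```

Whether each module then gets its own page depends on how the directives are laid out in the .rst files, which is the question raised above.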
Re: [pu jsonalchemy] Aggregation of several fields into one
Hi Esteban! With the None approach, I fear it can get confusing when iterating over all authors (or whatever other field), as one then gets None into the mix. If one really wants to get the first author, maybe calling first_author (where first_author is a direct lookup to _first_author) is enough, and then expecting a None or empty list is alright? I dunno, it seems to me that when asking "give me all the authors", None does not belong there with John Ellis. Cheers, Jan --- Jan Age Lavik System Developer INSPIRE-HEP http://inspirehep.net Github: @jalavik https://github.com/jalavik Work phone: +41 22 76 78682 On Thu, Mar 27, 2014 at 3:38 PM, Esteban Gabancho esteban.jose.garcia.gaban...@cern.ch wrote: Hey guys! I have a question about the aggregation of several fields into one. Taking the example of the authors, let's say I have two fields `_first_author` and `_additional_authors` and I want to aggregate them into `authors`. The common case, and the easiest, is when I have one `_first_author` and zero or more `_additional_authors`, in which case I just put a list with the authors (what else, right? :-) The problem, or the question, comes when I don’t have a `_first_author`, in which case I’m not sure about the content of the `authors` field: it could be i) only the list of `_additional_authors` or ii) `None` followed by the list of `_additional_authors`. I think the second solution is the closest one to reality: the `None` expresses that the record doesn’t have a first author. And I also think that we could apply this solution for other cases where we have this kind of situation (like with the `110__` and `710__`). What do you think? Lars, as you have already pu in production, how do you deal with this problem? Cheers, -- Esteban J. G. Gabancho
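The two options discussed in this thread can be sketched in plain Python. This is only an illustration of option (i) plus a separate first-author lookup, using ordinary dictionaries; the function names are hypothetical and this is not the JSONAlchemy field definition syntax.

```python
def aggregate_authors(record):
    """Build the ``authors`` list from the two underlying fields."""
    first = record.get("_first_author")
    additional = record.get("_additional_authors") or []
    # Option (i): leave out a missing first author instead of inserting None.
    return ([first] if first is not None else []) + list(additional)


def first_author(record):
    """Direct lookup; None signals that the record has no first author."""
    return record.get("_first_author")


record = {"_additional_authors": ["John Ellis"]}
print(aggregate_authors(record))  # ['John Ellis']
print(first_author(record))       # None
```

With this split, code that iterates over `authors` never has to filter out None, while code that cares about the missing first author can still detect it.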
Re: chatroom future?
Hello, I also really like the current hangout experience, mostly because it only requires a browser and is already integrated in the environments and devices I personally use. It is clear that most people's preference depends on their own preferred way of working and environment (politics or not). It's hard to reach a consensus that everyone is happy with. That said, the chat room has unfortunately been forgotten lately (at least by me) and new developers in our team have not been made aware of its existence either. That latter part is on us, but let's hope that this discussion will shed some needed light on it again and get developers to join up and help each other. Cheers, Jan On Tue, Oct 29, 2013 at 8:56 AM, Alexander Wagner a.wag...@fz-juelich.de wrote: On 29.10.2013 03:51, Tibor Simko wrote: Hi! in my case I simply find the current Hangout experience flawless. It is a good experience indeed, So, where do you hang out and I'll join in. I'll have to leave this other browser open, but well. politics However, personally, I strongly prefer /open/ in all regards. Open Access, Open Source, Open Protocols... Openness is the foundation of the Net. I don't like to throw that away and I do not understand why people throw our Net at G, M$, F and the like. I do not see a point in giving up the free Internet to company control just cause I'd have to read the manual to keep it free. So I really don't like nor use unsocial networks at all. /politics however there is always room for improvement, for instance in the client configurability department. Well, it's passed to G, M$, F, whoever. That's where it ends. It's not in your control anymore. Take what they give you and pay for it. For me, Jabber integrates better Agree. Though not on emacs as you ;) in my Emacs oriented workflow. See an incoming IM, press a key, answer message, press the key again, and voilà, back in the original work buffer. The Hangout client requires a bit more key presses and/or mouse movements... Agree. 
In any case it's a matter of taste. We tend to use hangouts for video conferencing; however, I'd prefer something more open there as well. (For what it's worth, German DFN's configs were not understood even by our geeks. So this clearly has some room for improvement. Evo went commercial and doesn't really like guests anymore, so...) The main point for me in a chat is to have a low-footprint and fast way to communicate. Our experience at the hgf-project with Jabber is pretty good. Lengthy things to the list, short stuff in chat. (BTW: Sam, I can't imagine that a geek like you has trouble setting up a Jabber client ;) Anyway, I miss you in the chat room. Was always very helpful and fast.) -- Kind regards, Alexander Wagner Subject Specialist Central Library 52425 Juelich mail : a.wag...@fz-juelich.de phone: +49 2461 61-1586 Fax : +49 2461 61-6103 www.fz-juelich.de/zb/DE/zb-fi Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
Re: RFC on-demand build system to test your personal branches
Very nice! On Wed, Jul 10, 2013 at 12:01 PM, Tibor Simko tibor.si...@cern.ch wrote: [*] We could make this a topic for next week's Invenio Developer Forum. Can demo existing features, muse about pros/cons, describe the plans and wishlists, and hold a Jenkins account creation party. +1 Cheers, Jan
Re: Invenio Developer Forum - Web APIs - today at 16:45 CET
Nice summary. I just have one small comment regarding the API versioning. I like the approach of using the headers, so that clients do not have to specify the API version directly in the URIs. If no indication is given at all, the service would use the latest version available for the service/verb by default. Perhaps even something like what this author proposes ( http://thereisnorightway.blogspot.ch/2011/02/versioning-and-types-in-resthttp-api.html), where the API version is part of the request type. Cheers, Jan On Mon, Mar 18, 2013 at 1:19 PM, Tibor Simko tibor.si...@cern.ch wrote: On Mon, 11 Mar 2013, Tibor Simko wrote: Today, we thought of musing about Web API standardisation process, including issues such as: (a) centralising all the various existing APIs to use leading `/api' URL space; (b) API versioning; (c) plugin system for modules; (d) API job and export task queue manager, possibly using Celery. I've updated the wiki page summarising the API musings: http://invenio-software.org/wiki/Talk/WebAPIs#a3.NewAPIs Still a work in progress, still containing undecided issues, but shows the overall direction we thought of taking. Please share any comments you may have. Best regards -- Tibor Simko
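A minimal sketch of the header-based versioning idea, assuming the version travels inside the media type of the `Accept` header. The media type `application/vnd.invenio.vN+json`, the version numbers, and the function name are all hypothetical; nothing here is taken from the wiki page or from existing Invenio code.

```python
import re

SUPPORTED_VERSIONS = (1, 2)  # hypothetical versions of one service
VERSION_PATTERN = re.compile(r"application/vnd\.invenio\.v(\d+)\+json")


def negotiate_version(headers):
    """Pick the API version from the Accept header, defaulting to the latest."""
    match = VERSION_PATTERN.search(headers.get("Accept", ""))
    if match is None:
        # No indication given: fall back to the latest supported version.
        return max(SUPPORTED_VERSIONS)
    version = int(match.group(1))
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported API version: %d" % version)
    return version


print(negotiate_version({"Accept": "application/vnd.invenio.v1+json"}))  # 1
print(negotiate_version({"Accept": "application/json"}))                 # 2
```

The URIs stay the same across versions, and only clients that need an older version have to say so.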
Re: [INSPIRE-DEV] Why I do not use regression tests as often as I should
Hello guys, Interesting discussion of a topic that I think deserves one. As all of you surely are, I am in favor of introducing more testing, unit tests or regression tests, overall in Invenio (and INSPIRE). It makes for better quality of life :-) There are many small things that have bitten us because of lack of test coverage, and as Piotr touches upon, if one wants to change something that may have minor or major repercussions, one feels almost scared of the consequences of doing it, meaning that new bugs not covered by tests appear and break the application. Over on the INSPIRE team, we are planning to introduce our own little ecosystem of code review and testing requirements before shipping to production (or master codebase), in addition to the other best practices of Invenio development, of course. For example, we are thinking of requiring tests for bugs that appear and get fixed, as well as for new features, before shipping out (to production or Invenio master). Enforcing this, and making sure the tests make sense, will be part of the responsibility of the code reviewers, and we hope to do much more code review in the near future. We see that this kind of set-up is more and more common in other software projects and we hope it is something we can take advantage of. More on this later. On the topic of deliberately failing tests in the codebase: I think these should definitely be clearly marked as intended to fail, and when running the normal unit-test suite they should not come up as normal errors/failures, since that is potentially confusing and time-inefficient, as Piotr points out. In addition, if some test case is deliberately failing because it should be fixed by someone, I think there are better alternatives to bring awareness to it, for example ticketing systems such as Trac. Cheers, Jan On 07/20/2012 10:59 AM, Alessio Deiana wrote: On Jul 19, 2012, at 4:48 PM, Piotr Praczyk wrote: Hello ! 
From: Samuele Kaplun Hi Piotr, On Wednesday, 18 July 2012 11.10:23, Piotr Praczyk wrote: It is less related to the failing tests themselves, but if the mechanism of testing as a quality measure of the software does not work, it does not motivate people to write new regression/unit tests. On the other hand some people prefer to follow test-driven development, by first implementing tests for functionalities that do not yet exist so that they will eventually be implemented as originally designed. YMMV. Up to now, my understanding of test-driven development was slightly different and I did not even think about such an approach. What I noticed is that we always test our code when coding. The problem is that the testing is manual. By taking the time to automate it, you get your time back very fast since you are basically running tests all day: adding some code then testing, adding some code, testing etc. I always thought about writing tests for the currently implemented feature and satisfying them before a commit. I think the weakness of this approach lies in the size of a team that is collaborating on a project. If you have a small number of frequently communicating developers, this can work. If everyone starts uploading failing tests, all other developers have to deal with them and see the results (which leads to the issues described in the previous e-mail). It is very difficult to distinguish tests that should fail because things are not implemented from those that fail because something is broken. Moreover, committing failing tests carries a risk of committing tests which will never succeed because they are simply not well written (this is the case with at least one currently committed test). This also leads to trouble. Maybe it is better to use a task tracking system for new features, or at least mark tests as not-satisfied, so that the testing framework could distinguish them automatically? 
I guess having a separate repository for tests of future features (one branch related with one ticket) is a bit of an overkill on the side of infrastructure, but we could think about something better than now. I am using nose as a wrapper around unittests. nose has a way of skipping tests. Check http://nose.readthedocs.org/en/latest/plugins/skip.html We could have a similar concept so that we know which tests are supposed to fail and they do not bother us unless we want to. -- Alessio Deiana INSPIRE Developer GS-SIS-OA CERN -- -- Jan Åge Lavik CERN System Librarian GS-SIS Office: 3-1-014 Mailbox: C27800
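The skipping idea also exists in the standard library's `unittest`, so a sketch does not need nose. The test names below are made up for illustration. `unittest.skip` marks a test for a feature that is not implemented yet, and `unittest.expectedFailure` marks a test that is known to fail, so neither shows up as an ordinary failure in the normal run.

```python
import unittest


class FutureFeatureTest(unittest.TestCase):

    @unittest.skip("feature not implemented yet")
    def test_planned_export(self):
        self.fail("not implemented")

    @unittest.expectedFailure
    def test_known_bug(self):
        # Deliberately failing assertion, reported as an expected failure.
        self.assertEqual(1 + 1, 3)

    def test_existing_behaviour(self):
        self.assertEqual(1 + 1, 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FutureFeatureTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful(), len(result.skipped), len(result.expectedFailures))
```

The run still counts as successful, while the skipped and expected-failure tests are listed separately, which makes them easy to tell apart from tests that fail because something is broken.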
Re: How to deal with change of syntax of an invenio.conf variable
Hi Samuele, On 04/05/2012 02:28 PM, Alexander Wagner wrote: This is currently considered as a comma-separated list of authorized user agent strings while I would like to change it into a regular expression that will be used to match incoming user-agent strings. Does this mean comma separated list of strings representing regular expressions or just one expression to cover all cases? The former would be less intrusive to existing values. And if the latter, then when reading the variable one could treat it as one item list or something, using the assumptions of the former. If I understood correctly. Cheers, Jan The problem of course is for Invenio users who have customized this variable to know that they will need to update this value. Wouldn't it be cleaner to use a new variable and let invenio throw a complaint if it finds the old one? E.g. an exception that notifies root about the issue? Or similarly if it finds the wrong syntax... -- Kind regards, Alexander Wagner Subject Specialist Central Library 52425 Juelich mail : a.wag...@fz-juelich.de phone: +49 2461 61-1586 Fax : +49 2461 61-6103 www.fz-juelich.de/zb/DE/zb-fi -- Jan Åge Lavik CERN System Librarian GS-SIS Office: 3-1-014 Mailbox: C27800
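The "comma-separated list of regular expressions" reading can be sketched as follows. The function names and the example value are hypothetical; the actual invenio.conf variable and the code that reads it are not shown in this thread.

```python
import re


def compile_user_agent_patterns(raw_value):
    """Split a comma-separated config value into compiled regular expressions."""
    parts = [part.strip() for part in raw_value.split(",")]
    return [re.compile(part) for part in parts if part]


def is_authorized(user_agent, patterns):
    """True if any configured pattern matches the incoming user-agent string."""
    return any(pattern.search(user_agent) for pattern in patterns)


# Hypothetical value; a single expression is simply a one-item list.
patterns = compile_user_agent_patterns("^ExampleHarvester/, ExampleBot")
print(is_authorized("ExampleHarvester/1.0", patterns))  # True
print(is_authorized("Mozilla/5.0", patterns))           # False
```

Two caveats with this sketch: an expression that itself contains a comma (for example a `{1,3}` quantifier) would be split in the wrong place, and existing plain strings containing characters such as `.` or `(` would change meaning once they are interpreted as regular expressions.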
Re: Invenio Holding Pen development status
Hi Sam, Currently the holding pen is planned to be revisited/updated with the new ingestion workflow that Roman and I are working on for INSPIRE. He started this job with the pink elephant (we are lacking a module name atm :): https://github.com/romanchyla/invenio/tree/pink-elephant One of the ideas here is to allow holding-pen records to be indexed and searchable in specific holding-pen indexes, and then to use BibFormat to allow customisable formats for holding pen records etc. Currently, I do not believe there is a way to associate persons with holding pen records, but it is something we have been discussing adding. For example, the cataloger workflow could be something like:
* Cataloger logs on to the site and views the holding pen
* The cataloger can then select various records (or group of records) and assign them to themselves etc.
But this entire procedure is not set in stone yet. The current holding pen interface lives in the OAI harvest admin page. Cheers, Jan On 03/29/2012 05:46 PM, Samuele Kaplun wrote: Dear all, http://invenio-software.org/ticket/496 (by Joe and Javier) was proposing an extension to robotupload to allow for sending records into the holding pen and notifying some responsible about them. http://invenio-software.org/ticket/152 (by Pablo) was re-inventing the holding pen inside bibsched. Since I am currently working on both robotupload API and bibsched quarantine, I was thinking to reuse the holding pen for quarantined records. Now comes the question: who is currently working on the holding pen? Is there already a way for responsible people to be notified about incoming records in the holding pen (such as proposed in a comment by Tibor in the above ticket 496)? Is there a centralized and usable web interface in Invenio to have an overview of the holding pen status? Cheers! Sam -- Jan Åge Lavik CERN System Librarian GS-SIS Office: 3-1-014 Mailbox: C27800