UK Ontology Network (UKON) 2016 - Last Call for Participation [Deadline: 31st March, 2016]

2016-03-29 Thread Phillip Lord



The Fifth UK Ontology Network meeting (#ukon2016) will take place on Thursday
April 14th, 2016 at Newcastle University, Newcastle upon Tyne. The aims of
this meeting are as follows:

  To enable dissemination of ontology relevant work from across multiple
  disciplines

  To encourage collaboration and cooperation between different members of UK
  organisations working in this area

  To help establish a research agenda in ontology, and to foster better
  communication with funding councils and industry


The full programme is now available, and we have a fascinating series of
talks, with demo and poster sessions. The meeting will also offer plenty of
opportunities for networking.

http://www.ukontology.org/programme


Registration for UKON 2016 is now open. Please register before the 31st of
March, using the link below:

http://www.ukontology.org/registration

Some hotels close to the venue offer reduced rates for UKON delegates. You can
use the link below to take advantage of special rates:

http://www.newcastlegateshead.com/UKON2016

Best wishes, and we look forward to seeing you in Newcastle.

Phillip Lord
James Malone
Goksel Misirli
Jennifer Warrender
Claire Smith


UKON 2016 Organisers



Re: Scholarly paper in HTML+RDF through RASH

2015-05-27 Thread Phillip Lord
Silvio Peroni  writes:

> Hi Marynas,
>
> first of all, thanks for your comments!
>
> A couple of answers, motivating why we didn’t originally choose to use the 
> HTML elements you suggested:
>
>> A couple remarks regarding HTML:
>>  could be 
>> http://www.w3.org/TR/html401/struct/text.html#edef-CODE 
>> 
>
> Basically, it was made on purpose as a design choice. In RASH we wanted to
> keep everything much easier, in particular when defining similar behaviour in
> different contexts (e.g., in inline elements and in block elements). If we use
> the full HTML approach as you suggested, I should use different tags for
> defining codes. In particular:
>
> Inline code definition: 
> This text contains a call to a function in italics as 
> an inline element.
>
> Block code definition:
> This is a full block of code
>
> As you can see, to have both situations I should use at least two additional
> elements of HTML (and thus I should have to extend RASH). In addition, to
> define block code, I should use *two* elements together.


Still, the <pre><code> combination is recommended in the HTML5
documentation. And tools like prism.js, for instance, support it
out-of-the-box. Simple is important, but not if it introduces complexity
elsewhere.
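For reference, the two patterns under discussion look like this in plain HTML5 (the function name and the Python fragment are made up for illustration; the language-* class is the convention prism.js keys on):

```html
<!-- Inline code: a bare <code> element inside running text -->
<p>This text contains a call to <code>a_function()</code> as an inline element.</p>

<!-- Block code: <code> wrapped in <pre>, as the HTML5 spec recommends -->
<pre><code class="language-python">
def a_function():
    return 42
</code></pre>
```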


Phil



Re: Vocabulary to describe software installation

2015-05-05 Thread Phillip Lord

This might be related!

http://www.interition.net/i/products/sparqlingcode.html


Jürgen Jakobitsch  writes:

> hi,
>
> i'm investigating possibilities to describe an arbitrary software
> installation process
> in rdf. currently i've found two candidates [1][2], but examples are
> practically non-existent.
> has anyone done this before, are there somewhere real examples?
>
> any pointer greatly appreciated.
>
> wkr j
>
> [1] http://www.w3.org/2005/Incubator/ssn/ssnx/ssn#Module_Deployment
> [2]
> http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#dcterms:instructionalMethod
>
> | Jürgen Jakobitsch,
> | Software Developer
> | Semantic Web Company GmbH
> | Mariahilfer Straße 70 / Neubaugasse 1, Top 8
> | A - 1070 Wien, Austria
> | Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22
>
> COMPANY INFORMATION
> | web   : http://www.semantic-web.at/
> | foaf  : http://company.semantic-web.at/person/juergen_jakobitsch
> PERSONAL INFORMATION
> | web   : http://www.turnguard.com
> | foaf  : http://www.turnguard.com/turnguard
> | g+: https://plus.google.com/111233759991616358206/posts
> | skype : jakobitsch-punkt
> | xmlns:tg  = "http://www.turnguard.com/turnguard#";

-- 
Phillip Lord,   Phone: +44 (0) 191 208 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Reference management

2014-10-13 Thread Phillip Lord

I don't think we have ever released it, but it is planned!

Phil


Mark Fallu  writes:

> Hi Phil,
>
> Nice work on Greycite - it looks like a very useful utility.
>
> Is the sourcecode for Greycite available?
>
> Cheers,
>
> Mark 
>
> Sent from my iPhone
>
>> On 9 Oct 2014, at 9:56 pm, Phillip Lord  wrote:
>> 
>> 
>> 
>> 
>> Simon Spero  writes:
>> 
>>>> On Oct 8, 2014 10:15 AM, "Gray, Alasdair"  wrote:
>>>> 
>>>> Or is that because they want to import it into their own reference
>>> management system, e.g. Mendeley, which does not support the HTML version?
>>> 
>>> 1. It is quite easy to embedded metadata in HTML pages in forms designed
>>> for accurate importing into reference managers (Hellman 2009). Mendeley has
>>> been known to have problems with imports in cases where a proxy server is
>>> involved.
>> 
>> Lindsay Marshall and I have done a fair amount of work extracting
>> metadata from HTML for citation purposes. With a fair amount of
>> heuristics, we can get enough metadata for a full citation from about
>> 60% of what you might call serious websites (i.e. those with technical
>> content). The hit rate on the general web is lower (about 1%), but most
>> of the web appears to be Chinese pornography.
>> 
>> This is available as a tool at http://greycite.knowledgeblog.org/.
>> 
>> And fuller description is available at http://arxiv.org/abs/1304.7151.
>> 
>> Phil
>> 
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Reference management

2014-10-09 Thread Phillip Lord



Simon Spero  writes:

> On Oct 8, 2014 10:15 AM, "Gray, Alasdair"  wrote:
>
>> Or is that because they want to import it into their own reference
> management system, e.g. Mendeley, which does not support the HTML version?
>
> 1. It is quite easy to embedded metadata in HTML pages in forms designed
> for accurate importing into reference managers (Hellman 2009). Mendeley has
> been known to have problems with imports in cases where a proxy server is
> involved.

Lindsay Marshall and I have done a fair amount of work extracting
metadata from HTML for citation purposes. With a fair amount of
heuristics, we can get enough metadata for a full citation from about
60% of what you might call serious websites (i.e. those with technical
content). The hit rate on the general web is lower (about 1%), but most
of the web appears to be Chinese pornography.

This is available as a tool at http://greycite.knowledgeblog.org/.

And fuller description is available at http://arxiv.org/abs/1304.7151.
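A minimal sketch of this kind of extraction, using only the standard library (the meta-tag names here are common Dublin Core / Highwire conventions, not Greycite's actual heuristics, and `MetaExtractor` is a hypothetical name):

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect citation-relevant <meta> tags from an HTML page.

    A sketch of the general approach only; a real tool layers many
    more heuristics on top of simple meta-tag scraping.
    """
    WANTED = {"dc.title", "dc.creator", "dc.date",
              "citation_title", "citation_author", "citation_date"}

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        name = a.get("name", "").lower()
        if name in self.WANTED:
            self.meta.setdefault(name, []).append(a.get("content", ""))

page = ('<html><head><meta name="DC.title" content="Greycite"/>'
        '<meta name="DC.creator" content="P. Lord"/></head></html>')
p = MetaExtractor()
p.feed(page)
print(p.meta)  # → {'dc.title': ['Greycite'], 'dc.creator': ['P. Lord']}
```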

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-08 Thread Phillip Lord


I'm always at a bit of a loss when I read this sort of thing. Kerning,
seriously? We can't share scientific content in HTML because of kerning?

In practice, web browsers do a perfectly reasonable job of text layout,
in real time, and do it in a way that allows easy reflowing. The thing
I like most about Sarven's LNCS style sheets, for instance, is that
I can turn them off; I don't like the LNCS format.

Having said all of that, five minutes of googling suggests that kerning
support is in Candidate Recommendation form at the W3C, and that at least
three different JS libraries support it.
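Concretely, the CSS properties in question look something like this (a sketch; font-kerning is from the CSS3 Fonts Candidate Recommendation, and font-feature-settings is the lower-level OpenType route some engines shipped first):

```css
/* Opt in to kerning from the font's built-in kern data */
p {
  font-kerning: normal;              /* CSS3 Fonts property */
  font-feature-settings: "kern" 1;   /* lower-level OpenType fallback */
}
```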

Phil

Luca Matteis  writes:
> I really appreciate the work that you're doing with trying to style an
> HTML page to look similar to the Latex templates. But there's so many
> typesetting details that are not available in browsers, which means
> you're going to do a lot of DOM hacking to be able to produce the same
> quality typography that Latex is capable of. Latex will justify text,
> automatically hyphenate, provide proper spacing, and other typesetting
> features. Not to mention kerning. Kerning is a *huge* thing in
> typography and with HTML you're stuck with creating a DOM element for
> every single letter - yup you heard me right.
>
> I think it would be super cool to create some sort of JavaScript
> framework that would enable the same level of typography that Latex is
> capable of, but you'll eventually hit some hard limitations and you'll
> probably be stuck drawing on a canvas.
>
> What are your ideas regarding these problems?
>
> On Wed, Oct 8, 2014 at 2:26 PM, Sarven Capadisli  wrote:
>> On 2014-10-08 14:10, Peter F. Patel-Schneider wrote:
>>>
>>> Done.
>>>
>>> The goal of a new paper-preparation and display system should, however,
>>> be to be better than what is currently available.  Most HTML-based
>>> solutions do not exploit the benefits of HTML, strangely enough.
>>>
>>> Consider, for example, citation links.  They generally jump you to the
>>> references section.  They should instead pop up the reference, as is
>>> done in Wikipedia.
>>>
>>> Similarly for links to figures.  Instead of blindly jumping to the
>>> figure, they should do something better, perhaps popping up the figure
>>> or, if the figure is already visible, just highlighting it.
>>>
>>> I have put in both of these as issues.
>>
>>
>> Thanks a lot for the issues! Really great to have this feedback.
>>
>> I have resolved and commented on some of those already, and will look at the
>> rest very shortly.
>>
>> I am all for improving the interaction as well. I'd like to state again that
>> the development was so far focused on adhering to the LNCS/ACM guidelines,
>> and improving the final PDF/print product. That is to get on reasonable
>> grounds with the "state of the art".
>>
>> Moving on: I plan to bring in the interaction and framework to easily
>> semantically enrich the document as well as the overall UX. I have some
>> preliminary code in my dev branch, and will bring it forward, and would like
>> feedback as well.
>>
>> Thanks again and please continue to bring forward any issues or feature
>> requests. Contributors are most welcome!
>>
>> -Sarven
>> http://csarven.ca/#i
>>
>>
>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: scientific publishing process (was Re: Cost and access)

2014-10-08 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
> The goal of a new paper-preparation and display system should, however, be to
> be better than what is currently available.  Most HTML-based solutions do not
> exploit the benefits of HTML, strangely enough.
>
> Consider, for example, citation links.  They generally jump you to the
> references section.  They should instead pop up the reference, as is done in
> Wikipedia.

Yes, I agree. I do this on my blog or rather provide it as an option.
The reference list is also automatically generated here, so, for
example, there is no metadata associated with the two references in
this post:

http://www.russet.org.uk/blog/3015

In both cases, the reference list is formed from the metadata at the
other end of the link, gathered either from the HTML or, in the case of
arXiv, from their XML-RPC interface.


> Similarly for links to figures.  Instead of blindly jumping to the figure,
> they should do something better, perhaps popping up the figure or, if the
> figure is already visible, just highlighting it.

Or better still, providing access to the code and data from which the
figure is derived.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-08 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

> PLOS is an interesting case.  The HTML for PLOS articles is relatively
> readable.  However, the HTML that the PLOS setup produces is failing at math,
> even for articles from August 2014.
>
> As well, sometimes when I zoom in or out (so that I can see the math better)
> Firefox stops displaying the paper, and I have to reload the whole page.

Interesting bug that. Worth reporting to PLoS.

> Strangely, PLOS accepts low-resolution figures, which in one paper I looked at
> are quite difficult to read.

Yep. Although, it often provides several links to download higher
res images, including in the original file format. Quite handy.

> However, maybe the PLOS method can be improved to the point where the HTML is
> competitive with PDF.

Indeed. For the moment, HTML views are about 1/5 of PDF views. Partly
this is because scientists are used to viewing in print format, I
suspect, but partly not.

I'm hoping that, eventually, PLoS will stop using image-based maths. I'd
like to be able to zoom maths independently, and copy and paste it as
either MathML or TeX. MathJax does this already.
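For illustration, wiring MathJax into a page is a one-line include (a sketch using the CDN path and combined-config name from the MathJax 2.x era):

```html
<script type="text/javascript"
  src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<!-- The TeX source stays in the page; MathJax renders it client side,
     and its context menu offers the source as MathML or TeX -->
<p>Einstein: \(e = mc^2\)</p>
```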

Phil



Re: Cost and access (Was Re: [ESWC 2015] First Call for Paper)

2014-10-07 Thread Phillip Lord
"Gray, Alasdair"  writes:

> On 7 Oct 2014, at 15:31, Phillip Lord
> mailto:phillip.l...@newcastle.ac.uk>>
>  wrote:
>
> "Gray, Alasdair" mailto:a.j.g.g...@hw.ac.uk>> writes:
> This is true. So, if the reason that ESWC and ISWC only accept papers in
> PDF is because we need LNCS for tenure and that they will only take PDF,
> it would be good to have a public statement about this.
>
> I think PDF is only at the submission stage. For camera ready the source file
> (s) - latex or word - are required.
>
> Again, I'd like to know for sure.
>
> For ISWC this year, it was certainly the case that I needed to submit the
> latex for the camera ready version.
>
> This presumably is for Springer/conference organisers to be able to get all
> the appropriate metadata that they add for indexing.


Sorry, I meant: I'd love to know for sure where the restriction on PDF
comes from. Could we change it to allow HTML tomorrow, and who would
complain?

We have seen some people already (Peter!), but I'd like to know where
the limiting factor is.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
 tex4ht takes the slightly strange approach of having a strange and
 incomprehensible command line, and then lots of scripts which apply default
 options, of which xhmlatex is one. In my installation, they've only put
 the basic ones into the path, so I ran this with
 /usr/share/tex4ht/xhmlatex.


 Phil

>>>
>>> So someone has to package this up so that it can be easily used.  Before 
>>> then,
>>> how can it be required for conferences?
>>
>> http://svn.gnu.org.ua/sources/tex4ht/trunk/bin/ht/unix/xhmlatex
>
> Somehow this is not in my tex4ht package.
>
> In any case, the HTML output it produces is dreadful.   Text characters, even
> outside math, are replaced by numeric XML character entity references.


So, I am willing to spend some time getting this to work. I would like
to plug some ESWC papers into tex4ht, to get some HTML which works plain
and also with Sarven's templates so that it *looks* like a PDF.

Would you be willing to a) try it and b) provide short, worked test
cases for things that do not work?

Phil



Re: Cost and access (Was Re: [ESWC 2015] First Call for Paper)

2014-10-07 Thread Phillip Lord
"Gray, Alasdair"  writes:
>> This is true. So, if the reason that ESWC and ISWC only accept papers in
>> PDF is because we need LNCS for tenure and that they will only take PDF,
>> it would be good to have a public statement about this.
>
> I think PDF is only at the submission stage. For camera ready the source file
> (s) - latex or word - are required.

Again, I'd like to know for sure.

> Also in this brave new world, how would the length of a submission be 
> determined?

There are lots of alternative measures. Word limits would work.

Page-based limits are pretty daft anyway. I am sure that you, like me,
have done some strange \baselineskip fiddling, or shrunk a figure to 99,
then 98, then 97% until it finally fits, although it isn't entirely
visible any more. Word limits avoid this.
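A word limit can be checked mechanically; here is a rough sketch (the regexes are illustrative, not a real LaTeX parser; tools like texcount do this properly):

```python
import re

def latex_word_count(src: str) -> int:
    """Very rough word count for LaTeX source: strip comments, commands
    and markup characters, then count what is left. Illustrative only."""
    src = re.sub(r"(?<!\\)%.*", " ", src)                   # % comments
    src = re.sub(r"\\[a-zA-Z]+\*?(\[[^\]]*\])?", " ", src)  # \commands[opts]
    src = re.sub(r"[{}$&~^_]", " ", src)                    # markup characters
    return len(src.split())

print(latex_word_count(r"\section{Intro} one two % a comment"))  # → 3
```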

For myself, I would drop word limits as well, and instead specify a
reading time of around 30 minutes. I have certainly gone through papers
in the past and made them less readable so that they fit within the page
limit. Ever removed all your adjectives? What about replacing
conjunctions with punctuation? If reviewers get bored ploughing through
an overly long paper, they just send a review saying tl;dr.

One of the interesting things about innovating with the publication
process is that it helps us find out what we actually care about in a
scientific paper, and what is just a hangover from our past.

Phil





Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
>
> So, you believe that there is an excellent set of tools for preparing,
> reviewing, and reading scientific publishing.
>
> Package them up and make them widely available.  If they are good, people will
> use them.
>
> Convince those who run conferences.  If these people are convinced, then they
> will allow their use in conferences or maybe even require their use.

Is that not the point of the discussion?

Unfortunately, we do not know why ISWC and ESWC insist on PDF.

> I'm not convinced by what I'm seeing right now, however.

Sure, but at least the discussion has meant that you have looked at some
of the tools again. That's no bad thing.

My question would be: are you more convinced than you were the last time
you looked, or less?

Phil




Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

> On 10/06/2014 11:00 AM, Phillip Lord wrote:
>> "Peter F. Patel-Schneider"  writes:
>>
>>> On 10/06/2014 09:32 AM, Phillip Lord wrote:
>>>> "Peter F. Patel-Schneider"  writes:
>>>>>> Who cares what the authors intend? I mean, they are not reading the
>>>>>> paper, are they?
>>>>>
>>>>> For reviewing, what the authors intend is extremely important.  Having
>>>>> different rendering of the paper interfere with the authors' message is
>>>>> something that should be avoided at all costs.
>>>>
>>>> Really? So, for example, you think that a reviewer with impared vision
>>>> should, for example, be forced to review a paper using the authors
>>>> rendering, regardless of whether they can read it or not?
>>>
>>> No, but this is not what I was talking about. I was talking about
>>> interfering with the authors' message via changes from the rendering
>>> that the authors' set up.
>>
>> It *is* exactly what you are talking about.
>
> Well, maybe I was not being clear, but I thought that I was talking about
> rendering  changes interfering with comprehension of the authors' intent.


And if only you had a definition of "rendering changes that interfere
with the authors' intent" as opposed to just "rendering changes".

I can guarantee that rendering a paper to speech WILL change at least
some of the authors' intent because, for example, figures will not
reproduce. You state that this should be avoided at all costs.

I think this is wrong. There are many reasons to change rendering. That
should be the reader's choice.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

>> tex4ht takes the slightly strange approach of having a strange and
>> incomprehensible command line, and then lots of scripts which apply default
>> options, of which xhmlatex is one. In my installation, they've only put
>> the basic ones into the path, so I ran this with
>> /usr/share/tex4ht/xhmlatex.
>>
>>
>> Phil
>>
>
> So someone has to package this up so that it can be easily used.  Before then,
> how can it be required for conferences?

http://svn.gnu.org.ua/sources/tex4ht/trunk/bin/ht/unix/xhmlatex

>
> I have tex4ht installed, but there is no xhmlatex file to be found.  I managed
> to find what appears to be a good command line

I don't know why that would be. It is installed with the Debian package,
although, as I said, it is not in the system path. I found it with
dpkg -S. I'm afraid it's a long time since I used an RPM-based system,
so I can't remember how to do this on Fedora.

>
> htlatex schema-org-analysis.tex "xhtml,mathml" " -cunihtf" "-cvalidate"
>
> This looks better when viewed, but the resultant HTML is unintelligible.
>
> There is definitely more work needed here before this can be considered as a
> potential solution.

Yes, I agree.

So, the question is how to enable this. One way would, for example, be
for ISWC and ESWC to accept HTML and have a prize for the best semantic
paper submitted. Then people with the inclination would do the work.

Again, I suspect it's not that much, but we will not know until we try.

Phil





Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord
Norman Gray  writes:
>
> This won't dynamically reflow, it's true (and that's a pity), but if I ever
> get a tablet computer, I doubt I'll be able to resist producing versions in a
> layout which is targeted at that size of screen.


Sure, that's fine. But why not have a version which behaves reasonably
at all screen sizes. This should be achievable.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-07 Thread Phillip Lord


The Stack Exchange discussion mostly talks about the user side of
things. Go back (quite) a few years and producing PDF from TeX was a
pain, pretty much up until pdflatex became the norm.

For those who think that LaTeX is still the best, I do not see that an
HTML-centric publishing framework should be a barrier. If the majority
of papers were being produced from Word, then it might be more of an
issue.

Phil


Luca Matteis  writes:

> Sorry to jump into this once again but when it comes to typesetting
> nothing really comes close to Latex/PDF:
> http://tex.stackexchange.com/questions/120271/alternatives-to-latex -
> not even HTML/CSS/JavaScript
>



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

> On 10/06/2014 09:32 AM, Phillip Lord wrote:
>> "Peter F. Patel-Schneider"  writes:
>>>> Who cares what the authors intend? I mean, they are not reading the
>>>> paper, are they?
>>>
>>> For reviewing, what the authors intend is extremely important.  Having
>>> different rendering of the paper interfere with the authors' message is
>>> something that should be avoided at all costs.
>>
>> Really? So, for example, you think that a reviewer with impaired vision
>> should, for example, be forced to review a paper using the authors'
>> rendering, regardless of whether they can read it or not?
>
> No, but this is not what I was talking about. I was talking about
> interfering with the authors' message via changes from the rendering
> that the authors' set up.

It *is* exactly what you are talking about. If I want to render your
document to speech, then why should I not? What I am saying is that,
you, the author, should not wish to constrain the rendering, only really
the content. Effectively, if you are using latex, you are already doing
this, since latex defines the layout and not you.

But I think we are talking in too abstract terms here. Should you be
able to constrain indentation for code blocks? Yes, of course you
should. But a quick look at the web shows that people do this all the
time.


>>> Similarly for reading papers, if the rendering of the paper interferes
>>> with the authors' message, that is a failure of the process.
>>
>> Yes, I agree. Which is why, I believe, that the rendering of a paper
>> should be up to the reader
>
> As this is why I believe that the authors' should be able to specify the
> rendering of their paper to the extent that they feel is needed to convey the
> intent of the paper.

For scientific papers, I think this really is not very far. I mean, a
scientific paper is not a fashion store; it's a story designed to
persuade with data. 

I would like to see papers which are in the hands of the reader as much
as possible. Citation format should be for the reader. Math
presentation. Graphs should be interactive and zoomable, with the data
underneath as CSV. 

All of these are possible and routine with HTML now. I want to be free
to choose the organisation of my papers so that I can convey what I
want. At the moment, I cannot. The PDF is not reasonable for all, maybe
not even most of this. But some.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

> On 10/06/2014 09:28 AM, Phillip Lord wrote:
>> "Peter F. Patel-Schneider"  writes:
>>>> It does MathML I think, which is then rendered client side. Or you could
>>>> drop math-mode straight through and render client side with mathjax.
>>>
>>> Well, somehow png files are being produced for some math, which is a 
>>> failure.
>>
>> Yeah, you have to tell it to do mathml. The problem is that older
>> versions of the browsers don't render mathml, and image rendering was
>> the only option.
>
> Well, then someone is going to have to tell people how to do this.  What I saw
> for htlatex was that it just did the right thing.


So, htlatex is part of TeX4ht, which does HTML.

If you do xhmlatex then you get XHTML with, indeed, math mode in MathML.
So, for example, the default xhmlatex produces output like:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>e</mi><mo>=</mo><mi>m</mi><msup><mi>c</mi><mn>2</mn></msup></math>

tex4ht takes the slightly strange approach of having a strange and
incomprehensible command line, and then lots of scripts which apply default
options, of which xhmlatex is one. In my installation, they've only put
the basic ones into the path, so I ran this with
/usr/share/tex4ht/xhmlatex.


Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
>> Who cares what the authors intend? I mean, they are not reading the
>> paper, are they?
>
> For reviewing, what the authors intend is extremely important.  Having
> different rendering of the paper interfere with the authors' message is
> something that should be avoided at all costs.

Really? So you think that a reviewer with impaired vision should, for
example, be forced to review a paper using the authors' rendering,
regardless of whether they can read it or not?

Of course, this is an extreme example, although not an unrealistic one.
Is it fundamentally any different from my desire, as I get older, to be
able to change font size and refill paragraphs with ease? I see a
difference of scale, that is all.


> Similarly for reading papers, if the rendering of the paper interferes
> with the authors' message, that is a failure of the process.

Yes, I agree. Which is why, I believe, that the rendering of a paper
should be up to the reader.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
>> It does MathML I think, which is then rendered client side. Or you could
>> drop math-mode straight through and render client side with mathjax.
>
> Well, somehow png files are being produced for some math, which is a failure.

Yeah, you have to tell it to do mathml. The problem is that older
versions of the browsers don't render mathml, and image rendering was
the only option.

> I don't know what the way to do this right would be, I just know that the
>
> There are many cases where line breaks and indentation are important for
> understanding.  Getting this sort of presentation right in latex is a pain for
> starters, but when it has been done, having the htlatex toolchain mess it up
> is a failure.

Indeed. I believe that there are plans in future versions of HTML to
introduce a "pre" tag which preserves indentation and line breaks.


>> Which gets us back to the chicken and egg situation. I would probably do
>> this; but, at the moment, ESWC and ISWC won't let me submit it. So, I'll
>> end up with the PDF output anyway.
>
> Well, I'm with ESWC and ISWC here.  The review process should be designed to
> make reviewing easy for reviewers.

I *only* use PDF when reviewing. I never use it for viewing anything
else. I only use it for reviewing since I am forced to. 

Experiences differ, so I find this a far from compelling argument.


>> This is why it is important that web conferences allow HTML, which is
>> where the argument started. 

> Why?  What are the benefits of HTML reviewing, right now?  What are the
> benefits of HTML publishing, right now?

Well, we've been through this before, so I'll not repeat myself.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
> I would be totally astonished if using htlatex as the main way to produce
> conference papers were as simple as this.
>
> I just tried htlatex on my ISWC paper, and the result was, to put it mildly,
> horrible.  (One of my AAAI papers was about the same, the other one caused an
> undefined control sequence and only produced one page of output.)   Several
> parts of the paper were rendered in fixed-width fonts.  There was no attempt
> to limit line length.  Footnotes were in separate files.


The footnote thing is pretty strange, I have to agree. Although
"footnotes" are a fairly alien concept with respect to the web;
hover-overs would probably be a reasonable presentation for them.


> Many non-scalable images were included, even for simple math.

It does MathML I think, which is then rendered client side. Or you could
drop math-mode straight through and render client side with mathjax.


> My carefully designed layout for examples was modified in ways that
> made the examples harder to understand. 

Perhaps this is a key difference between us. I don't care about the
layout, and want someone to do it for me; it's one of the reasons I use
latex as well.


> That said, the result was better than I expected.  If someone upgrades htlatex
> to work well I'm quite willing to use it, but I expect that a lot of work is
> going to be needed.

Which gets us back to the chicken and egg situation. I would probably do
this; but, at the moment, ESWC and ISWC won't let me submit it. So, I'll
end up with the PDF output anyway.

This is why it is important that web conferences allow HTML, which is
where the argument started. If you want something that prints just
right, PDF is the thing for you. If you want to read your papers in
the bath, likewise, PDF is the thing for you. And that's fine by me (so
long as you don't mind me reading your papers in the bath!). But it
needs to not be the only option.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:
> However, my point was not about looking good.  It was about being able to see
> the paper in the way that the author intended. 

Yes, I understand this. It's not something that I consider at all
important, which perhaps reflects our different viewpoints. Readers
have different preferences. I prefer reading in inverse video; I like to
be able to change font size to zoom in and out. I quite like fixed-width
fonts. Other people like the two-column thing. Other people want things
read to them.
Who cares what the authors intend? I mean, they are not reading the
paper, are they?


> I do write papers with considerable math in them, so my experience may
> not be typical, but whenever I have tried to produce HTML versions of
> my papers, I have ended up quite frustrated because even I cannot get
> them to display the way I want them to.

I've been using mathjax on my website for a long time and it seems to
work well, although I am not maths heavy.
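
For anyone wanting to try the same, a minimal page along these lines is
enough. This is a sketch, not MathJax's canonical setup: the CDN URL and
component name reflect one current release and may need updating.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Load MathJax from a CDN. The URL and component name are
       assumptions for illustration and may change between releases. -->
  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>
</head>
<body>
  <!-- Inline TeX delimited with \( ... \), rendered client-side. -->
  <p>Euler's identity: \( e^{i\pi} + 1 = 0 \)</p>
</body>
</html>
```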


> It may be that there are now good tools for producing HTML that carries the
> intent of the author.  htlatex has been mentioned in this thread.  A solution
> that uses htlatex would have the benefit of building on much of the work that
> has been done to make latex a reasonable technology for producing papers.  If
> someone wants to create the necessary infrastructure to make htlatex work as
> well as pdflatex does, then feel free.

It's more a matter of making htlatex work as well as lncs.sty does.
htlatex produces reasonable, if dull, HTML off the bat.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
Sarven Capadisli  writes:

> I will bet that if the requirements evolve towards Webby submissions, within
> 3-5 years time, we'd see a notable change in how we collect, document and mine
> scientific research in SW. This is not just being "hopeful". I believe that if
> all of the newcomers into the (academic) research scene start from HTML (and
> friends) instead of LaTeX/Word (and friends), we wouldn't be having this
> discussion. If the newcomes are told to deal with LaTeX/Word (regardless of
> hand coding or using a WYSIWYG editor) today, they are going to do exactly
> that.


I would look at an environment with less external force. The free
software engineering community produces its documents in a very
wide range of formats. If you peruse GitHub, the key characteristics
are, I think: that they are text formats, because these are easy to
version alongside source and are hackable; and that they mostly dump to
HTML. PDFs are very rare these days.

It would be fun to see which are the most used. Markdown is a big
contender, as well as language-specific formats (reStructuredText in
the Python world, for example).

I don't believe that HTML is a good authoring format any more than PDF
is, but I don't see this as a huge problem. HTML needs to be part of the
tool-chain, not all of it.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
Luca Matteis  writes:

> On Sun, Oct 5, 2014 at 4:34 PM, Ivan Herman  wrote:
>> The real problem is still the missing tooling. Authors, even if technically
>> savy like this community, want to do what they set up to do: write their
>> papers as quickly as possible. They do not want to spend their time going
>> through some esoteric CSS massaging, for example. Let us face it: we are not
>> yet there. The tools for authoring are still very poor.
>
> But are they still very poor? I mean, I think there are more tools for
> rendering HTML than there are for rendering Latex. In fact there are
> probably more tools for rendering HTML than anything else out there,
> because HTML is used more than anything else. Because HTML powers the
> Web!
>
> You can write in Word, and export in HTML. You can write in Markdown
> and export in HTML. You can probably write in Latex and export in HTML
> as well :)


Yes, you can. Most of the publishers use XML at some point in their
process, and latex gets exported to that.

I am quite happy to keep LaTeX as a user interface, because it's very
nice, and the tools for it are mature for academic documents
(in practice, this means cross-referencing and bibliographies).

So, as well as providing an LNCS stylesheet, we'd need an htlatex cf.cfg
and one CSS, and it's done. It would be good to have another CSS for
on-screen viewing; LNCS's back-of-a-postage-stamp layout is very poor
for that.

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread Phillip Lord
"Peter F. Patel-Schneider"  writes:

> One problem with allowing HTML submission is ensuring that reviewers can
> correctly view the submission as the authors intended it to be viewed.  How
> would you feel if your paper was rejected because one of the reviewers could
> not view portions of it?  At least with PDF there is a reasonably good chance
> that every paper can be correctly viewed by all its reviewers, even if they
> have to print it out.  I don't think that the same claim can be made for
> HTML-based systems.


I don't think this is a valid point. It is certainly possible to write
HTML that will not look good on every machine, but these days, it is
easier to write HTML that does.

The same is true with PDF. Font problems used to be routine. And, as
other people have said, it's very hard to write a PDF that looks good on
anything other than paper.


> Further, why should there be any technical preference for HTML at all?  (Yes,
> HTML is an open standard and PDF is a closed one, but is there anything else
> besides that?)  Web conference vitally use the web in their reviewing and
> publishing processes.  Doesn't that show their allegiance to the web?  Would
> the use of HTML make a conference more webby?

PDF is, I think, open these days. But, yes, I do think that conferences
should eat their own dog food. I mean, what would you think if the W3C
produced all of their documents in PDF? Would that make sense?

Phil



Re: scientific publishing process (was Re: Cost and access)

2014-10-03 Thread Phillip Lord


In my opinion, the opposite is true: it is PDFs that I almost always end
up printing out. This isn't the point, though.

Necessity is the mother of invention. In an ideal world, a web
conference would allow only HTML submission. Failing that, it would at
least allow HTML submission alongside PDF. But, currently, we cannot
submit HTML at all. What is the point of creating a better method if we
can't use it?

The only argument that seems at all plausible to me is, well, we've
always done it like this, and it's too much effort to change. I could
appreciate that.

Anyway, the argument is going round in circles.

"Peter F. Patel-Schneider"  writes:

> In my opinion PDF is currently the clear winner over HTML in both the ability
> to produce readable documents and the ability to display readable documents in
> the way that the author wants them to display.  In the past I have tried
> various means to produce good-looking HTML and I've always gone back to a
> setup that produces PDF.  If a document is available in both HTML and PDF I
> almost always choose to view it in PDF.  This is the case even though I have
> particular preferences in how I view documents.
>
> If someone wants to change the format of conference submissions, then they are
> going to have to cater to the preferences of authors, like me, and reviewers,
> like me.  If someone wants to change the format of conference papers, then
> they are going to have to cater to the preferences of authors, like me,
> attendees, like me, and readers, like me.
>
> I'm all for *better* methods for preparing, submitting, reviewing, and
> publishing conference (and journal) papers.  So go ahead, create one.  But
> just saying that HTML is better than PDF in some dimension, even if it were
> true, doesn't mean that HTML is better than PDF for this purpose.
>
> So I would say that the semantic web community is saying that there are better
> formats and tools for creating, reviewing, and publishing scientific papers
> than HTML and tools that create and view HTML.  If there weren't these better
> ways then an HTML-based solution might be tenable, but why use a worse
> solution when a better one is available?
>
> peter
>
>
>
>
>
> On 10/03/2014 08:02 AM, Phillip Lord wrote:
> [...]
>>
>> As it stands, the only statement that the semantic web community are
>> making is that web formats are too poor for scientific usage.
> [...]
>>
>> Phil
>>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Cost and access (Was Re: [ESWC 2015] First Call for Paper)

2014-10-03 Thread Phillip Lord
Eric Prud'hommeaux  writes:

> Let's work through the requirements and a plausible migration plan. We need:
>
> 1 persistent storage: it's hard to beat books for a feeling of persistence.
> Contracts with trusted archival institutions can help but we might also
> want some assurances that the protocols and formats will persist as well.

In my area, the majority of journals aren't printed; I've thrown away
conference proceedings for the last decade anyway.

Protocols and formats are, true, a problem. In an argument between HTML
and PDF, though, it's hard to see that one has an advantage over the
other. My experience is that HTML is easier to extract text from, which
is always going to be the baseline.
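
The claim about text extraction can be illustrated with nothing beyond
the Python standard library; the sample markup below is invented for
illustration.

```python
# A minimal sketch: pulling the plain text out of HTML with only the
# standard library's html.parser, no third-party tooling needed.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Collect every run of character data between tags.
        self.chunks.append(data)

def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return "".join(parser.chunks)

print(extract_text("<p>Hello, <em>semantic</em> web.</p>"))
# → Hello, semantic web.
```

Doing the equivalent for a PDF requires reconstructing reading order
from positioned glyphs, which is exactly why HTML is the easier
baseline.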

For what it is worth, there are archiving solutions; archive.org and
arxiv.org both leap to mind.


> 2 impact factor: i have the impression that conventional publishers have a
> bit of a monopoly and and sudden disruption would be hard to engineer. How
> do to get leading researchers to devote their work in some new crackpot
> e-journal to the exclusion of other articles which will earn them more
> points towards tenure and grants? Perhaps the answer is slowly build the
> impact factor; perhaps it's some sort of revolution in the minds of
> administrators and funders.

This is true. So, if the reason that ESWC and ISWC only accept papers in
PDF is that we need LNCS for tenure, and that LNCS will only take PDF,
it would be good to have a public statement about this.

As it stands, the only statement that the semantic web community are
making is that web formats are too poor for scientific usage.


> I work towards a network of actionable data just like the rest of you so I
> don't want to discourage this conversation; I just want to focus it.

Okay. I would like to know who made the decision that HTML is not
acceptable and why.

Phil



Re: Formats and icing (Was Re: [ESWC 2015] First Call for Paper)

2014-10-03 Thread Phillip Lord
Luca Matteis  writes:
> I'd like to say that I'm an HTML/CSS/JavaScript aficionado so I'd be
> the first to embrace Web standards to produce publications. I'm simply
> playing a bit of the devil's advocate here because I think that Latex
> is still more mature than HTML for writing papers. However, I must
> admit I'd like to see a future where that is different.

The conference does not want latex, it wants PDF. So write your
documents in latex, publish in HTML. The only thing that needs to change
are the tools in the middle.


> But before we ask conferences to embrace this still immature HTML
> world (at least for producing papers) we must write the frameworks,
> the libraries, the CSS templates that enable the same level of
> publication that Latex enables.

Well, that's already been done. As for "the same level of publication" I
profoundly disagree. LNCS format is very poor for anything other than
printing. I want a form of publication that allows me, the reader, to
switch layout.

> For solving the browser inconsistencies, standalone tools (based on a
> Webkit engine for example) must be built that produce a consistent
> printable layout no matter the operating system (browser fonts render
> differently on Mac/Windows/Linux).

Seriously? You want to build another browser? My experience is that the
web is more consistent than PDF. Font problems with PDFs used to be the
norm; I tend not to use PDFs much now, so perhaps that's changed.

And, again, printable? At least some of us want to move away from that.
Styling is a reader issue, not an authorial one.


> So yes, we can get there, but there's some work to be done to prove
> that HTML is up for task. 

No. There is work to be done to prove that we can break the habit of a
lifetime. HTML is far from immature. We move, and then we fix any
problems that we may have. Why would we bother before?

> Phillip Lord, by interactions I don't mean simple animations, I mean
> this: http://worrydream.com/LadderOfAbstraction/ - use the right side
> scrolling to instantly see the output given different inputs. That's
> powerful stuff.

Colour figures and animations would be a nice start though.

Phil



Re: [ESWC 2015] First Call for Paper

2014-10-02 Thread Phillip Lord

My library will not know this. So, they will ask me to submit a PDF (oh
dear) of the final paper to our eprints archive. Extra work, as I said.


John Domingue  writes:

> As well as being irritating, UK academics submitting to ESWC run the
> risk that their papers will not be open to REF submission; even if they
> are, we have to go to additional efforts to ensure they are green OA
> published. This is also true of ISWC which makes the semantic web a
> pretty unattractive area to do research in.
>
>
> for both ISWC and ESWC the PDFs are freely available e.g. see [1]
>
> John
>
> [1] http://2014.eswc-conferences.org/program/accepted-papers
>
>
> On 2 Oct 2014, at 12:23, Phillip Lord 
> mailto:phillip.l...@newcastle.ac.uk>> wrote:
>
> Sarven Capadisli mailto:i...@csarven.ca>> writes:
>
> On 2014-10-01 13:36, Mauro Dragoni wrote:
> Papers should not exceed fifteen (15) pages in length and must be
> formatted according to the guidelines for LNCS authors. Papers must be
> submitted in PDF (Adobe's Portable Document Format) format.
>
> As I understand it, there is a disconnect between the submission format and
> what ESWC wishes to achieve or encourage [1].
>
> Can someone please elaborate on how forcing researchers to use PDF to share
> their publicly funded knowledge instead of SW/LD technologies and tools better
> fulfills [1], or perhaps even contributes towards the Semantic Web "vision"?
>
> I would like to better discover and use SW research knowledge. ESWC
> encouraging and promoting PDF for knowledge sharing sets an unnecessary limit
> on discovery and use.
>
> Will you consider encouraging the use of Semantic Web / Linked Data
> technologies for Extended "Semantic Web" Conference paper submissions?
>
>
> Additionally, submission is to a closed access publisher, requiring us
> to sign our copyright away in return for, well, nothing.
>
> As well as being irritating, UK academics submitting to ESWC run the
> risk that their papers will not be open to REF submission; even if they
> are, we have to go to additional efforts to ensure they are green OA
> published. This is also true of ISWC which makes the semantic web a
> pretty unattractive area to do research in.
>
> Can we dump both Springer and PDF please?
>
> Phil
>
>
> _
> Deputy Director, Knowledge Media Institute, The Open University
> Walton Hall, Milton Keynes, MK7 6AA, UK
> phone: 0044 1908 653800, fax: 0044 1908 653169
> email: john.domin...@open.ac.uk<mailto:j.b.domin...@open.ac.uk> web:
> kmi.open.ac.uk/people/domingue/<http://kmi.open.ac.uk/people/domingue/>
>
> President, STI International
> Amerlingstrasse 19/35, Austria - 1060 Vienna
> phone: 0043 1 23 64 002 - 16, fax: 0043 1 23 64 002-99
> email: john.domin...@sti2.org<mailto:john.domin...@sti2.org>  web: 
> www.sti2.org<http://www.sti2.org/>
>
>
>
>
>
>
>
>
>
> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt
> charity in England & Wales and a charity registered in Scotland (SC 038302).
> The Open University is authorised and regulated by the Financial Conduct
> Authority.




Re: Formats and icing (Was Re: [ESWC 2015] First Call for Paper)

2014-10-02 Thread Phillip Lord
Luca Matteis  writes
> So until we start building interactive publications, 
> I see no reason to move away from the
> wonders that Latex/PDF can accomplish. 


Because PDF is rubbish on the web.

Because almost all of the software tools for data visualisation are
being written in JS these days.

Because PDF is hard to extract from.

Because embedding metadata is easy in HTML.

Because we do not make interactive publications only because the
technology we are using is antiquated and does not let us, not because
we do not want to. I've even had journals try to charge me extra for
colour.

And, besides, we are making interactive publications. The bioinformatics
community do this all the time. Often with data, and downloadable VMs so
you can rerun the analysis.
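
The metadata point above can be made concrete with a few lines of
markup. This is a sketch using RDFa with Dublin Core terms; the title,
author, and date values are invented for illustration.

```html
<!-- Paper metadata embedded directly in the document head via RDFa;
     the dc: prefix is the Dublin Core terms vocabulary, and the
     content values here are purely illustrative. -->
<html prefix="dc: http://purl.org/dc/terms/">
  <head>
    <meta property="dc:title"   content="An Example Paper" />
    <meta property="dc:creator" content="A. N. Author" />
    <meta property="dc:date"    content="2014-10-02" />
  </head>
  <body>
    <!-- paper body -->
  </body>
</html>
```

Any RDFa-aware tool can then harvest the bibliographic data straight
from the published page, with no separate metadata file.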

>> Maybe. But, that's totally backwards, IMO.
>
> But why is it backwards? We have different formats serving different
> purposes. Diversity is healthy. Simply because PDF is not in the Web
> stack it doesn't make it Web-unfriendly.

Yes, actually, it does.

Phil





Re: [ESWC 2015] First Call for Paper

2014-10-02 Thread Phillip Lord
Sarven Capadisli  writes:

> On 2014-10-01 13:36, Mauro Dragoni wrote:
>> Papers should not exceed fifteen (15) pages in length and must be
>> formatted according to the guidelines for LNCS authors. Papers must be
>> submitted in PDF (Adobe's Portable Document Format) format.
>
> As I understand it, there is a disconnect between the submission format and
> what ESWC wishes to achieve or encourage [1].
>
> Can someone please elaborate on how forcing researchers to use PDF to share
> their publicly funded knowledge instead of SW/LD technologies and tools better
> fulfills [1], or perhaps even contributes towards the Semantic Web "vision"?
>
> I would like to better discover and use SW research knowledge. ESWC
> encouraging and promoting PDF for knowledge sharing sets an unnecessary limit
> on discovery and use.
>
> Will you consider encouraging the use of Semantic Web / Linked Data
> technologies for Extended "Semantic Web" Conference paper submissions?


Additionally, submission is to a closed access publisher, requiring us
to sign our copyright away in return for, well, nothing.

As well as being irritating, UK academics submitting to ESWC run the
risk that their papers will not be open to REF submission; even if they
are, we have to go to additional efforts to ensure they are green OA
published. This is also true of ISWC which makes the semantic web a
pretty unattractive area to do research in.

Can we dump both Springer and PDF please?

Phil



Re: Fact ranking game (creation of ground truth)

2014-08-14 Thread Phillip Lord

Any chance of a registration-free demonstrator?

Phil

"Bobic, Tamara"  writes:
> Here you will find our tool<http://s16a.org/fr/> that is used to rank facts 
> about ~500 popular entities from Wikipedia.
>
> You have to register with the tool and then the task will be explained to you
> in detail. You might interrupt your rating of the presented facts any time you
> like and continue later. To make it a bit more interesting, you will be able
> to score points and see your ranking in a highscore list.
>
>
> We would really appreciate your help in this task. Please do also
> spread the word. The more participants, the more valid our ground
> truth will be.
>
>
> Thanks and best regards,
>
> Semantic Technologies Team
>
> --
> Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
> Prof.-Dr.-Helmert-Str. 2-3
> D-14482 Potsdam
> Germany
> http://www.hpi.de




Re: How to Find Ontologies and Data for Reuse

2014-04-01 Thread Phillip Lord
Arvind Padmanabhan  writes:
> In the spirit of reuse, is there a tool I can use to find applicable
> sources? I am starting of with Indian Classical Music but in future I will
> extend this work to all data pertaining to India.


There are lots of ways, but Google is probably as good as any.

For instance:

https://www.google.co.uk/search?q=raga+ontology

Got me three good links.

Phil



Re: Representing NULL in RDF

2013-06-03 Thread Phillip Lord
Jan Michelfeit  writes:
> I was doing some comparison of relational databases and Linked Data and ran
> into the problem of representing an equivalent of database NULL in RDF.
>
> I was surprised I haven't found any material or discussion on this topic
> (found only [1]) - is there some?. I believe it would be beneficial if this
> question was answered somewhere for future reference. I started a question on
> Stack Overflow [2] where I think it will be easier to discover and so that
> this list won't get polluted.
>
> I'm aware of the open world assumption in RDF, but NULL or a missing value can
> have several interpretations, for example:
>
> - value not applicable (the attribute does not exist or make sense in the 
> context)
> - value uknown (it should be there but the source doesn't know it)
> - value doesn't exist (e.g. year of death for a person alive)
> - value is witheld (access not allowed)
>
> I would like to known whether there is some *standard or generally accepted*
> way of distinguishing these cases. If you have an answer, please put it on
> [2], is possible.

It's a little unclear what you could do with this. 

Say, for example, you assert that a value is withheld. Then I assert the
value. With a database, this cannot happen; either a value is NULL or it
is not.

Value unknown is easy. Just don't say anything. 

Value not applicable and value doesn't exist, given your examples, seem
the same to me, and depend on what bit of RDF you are using. You can,
for example, assert that anything with a year of death is necessarily a
dead person. Asserting a death date for a living person will then result
in contradictory information that you have to deal with in some way.
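
A sketch of that pattern in Turtle; the ex: namespace and the class and
property names are invented for illustration, not taken from any
standard vocabulary.

```turtle
@prefix ex:   <http://example.org/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Anything with a death date is inferred to be a dead person ...
ex:deathDate  rdfs:domain    ex:DeadPerson .

# ... and dead and living persons are declared disjoint.
ex:DeadPerson owl:disjointWith ex:LivingPerson .

# Asserting a death date for someone typed as living now makes the
# data inconsistent, which a reasoner will flag.
ex:alice a ex:LivingPerson ;
    ex:deathDate "2013" .
```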

Phil




[ANN] tawny-owl 0.11

2013-05-23 Thread Phillip Lord

I'm pleased to announce the release of tawny-owl 0.11. 

What is it?
===========

This package allows users to construct OWL ontologies in a fully
programmatic environment, namely Clojure. This means that users can take
advantage of a programming language to automate and abstract the
ontology over the development process; that, rather than requiring the
creation of ontology-specific development environments, a normal
programming IDE can be used; and finally, that a human-readable text
format means we can integrate with the standard tooling for versioning
and distributed development.

Changes
=======

# 0.11

## New features

- facts on individuals are now supported
- documentation has been greatly extended
- OWL API 3.4.4


A new paper on the motivation and use cases for tawny-owl is also
available at http://www.russet.org.uk/blog/2366

https://github.com/phillord/tawny-owl

Feedback welcome!





Re: Is science on sale this week?

2013-05-16 Thread Phillip Lord


With no license data on them. Still, this is a good thing. All you need
to do now is to take them off Springer Link, so no one gets confused,
and we have a sensible conference publication system.

Phil

Pavel Klinov  writes:

> The key is "outside Springer Link". The ISWC 2012 (research) proceedings
> are freely available:
>
> http://iswc2012.semanticweb.org/research-papers
>
> Cheers,
>
> Pavel
>
>
> On Tue, May 14, 2013 at 11:32 PM, Alexander Garcia Castro <
> alexgarc...@gmail.com> wrote:
>
>> Daniel, I may not be understainding this but it seems that ISWC2012
>> http://www.springer.com/computer/ai/book/978-3-642-35172-3 has a price
>> tag of 51.16 Euros.
>>
>> On Tue, May 14, 2013 at 6:42 PM, Daniel Schwabe 
>> wrote:
>> > All,
>> > for the record, the ISWC series publishes its proceedings freely
>> accessible on the Web outside Springer Link.
>> > I personally negotiated this right with Springer when I chaired ISWC
>> 2006.
>> > This was negotiated for the whole series, not just that year.
>> > This was backed by SWSA, who promotes the conference series, as they all
>> agreed it would not make sense for a conference like ISWC not to have open,
>> freely accessible proceedings online.
>> > This has always been the case for the WWW conference series as well.
>> >
>> > Cheers
>> > Daniel
>> > ---
>> > Daniel Schwabe  Dept. de Informatica, PUC-Rio
>> > Tel:+55-21-3527 1500 r. 4356R. M. de S. Vicente, 225
>> > Fax: +55-21-3527 1530   Rio de Janeiro, RJ 22453-900, Brasil
>> > http://www.inf.puc-rio.br/~dschwabe
>> > On May 14, 2013, at 06:12  - 14/05/13, 
>> > phillip.l...@newcastle.ac.uk(Phillip Lord) wrote:
>> >
>> >>
>> >> ISWC and ESWC are a particular problem because they are both Springer. I
>> >> pulled my paper from publication last year, as they would not do an open
>> >> access option.
>> >>
>> >> So, with the situation as it stands, I cannot publish any semantic web
>> >> research in either of these two conferences.
>> >>
>> >> Phil
>> >>
>> >> Alexander Garcia Castro  writes:
>> >>
>> >>> conferences are important on their own. for instance, right now the
>> >>> ISWC is an important conference regardless of the publisher of the
>> >>> proceedings. if I wanted to get the 2012 proceedings I may have to pay
>> >>> (http://www.springer.com/computer/ai/book/978-3-642-35172-3). do
>> >>> publishers pay the ISWC organizers for the right to publish the
>> >>> proceedings? I mean, as things are now the ISWC brings people to
>> >>> springer, not the other way around.
>> >>>
>> >>> On Mon, May 13, 2013 at 5:34 PM, Leon Derczynski 
>> wrote:
>> >>>> Reliable dissemination.
>> >>>>
>> >>>> CEUR-WS, ACL Anthology et al. do a valuable, critical job.
>> >>>>
>> >>>>
>> >>>> On 13 May 2013 17:25, Sarven Capadisli  wrote:
>> >>>>>
>> >>>>> Hi!
>> >>>>>
>> >>>>> If we subscribe to science, free and open access to knowledge,
>> what's the
>> >>>>> purpose of the arrangement between conferences and publishers?
>> >>>>>
>> >>>>> -Sarven
>> >>>>> http://csarven.ca/#i
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Leon R A Derczynski
>> >>>> Research Associate, NLP Group
>> >>>>
>> >>>> Department of Computer Science
>> >>>> University of Sheffield
>> >>>> Regent Court, 211 Portobello
>> >>>> Sheffield S1 4DP, UK
>> >>>>
>> >>>> +45 5157 4948
>> >>>> http://www.dcs.shef.ac.uk/~leon/
>> >>>>
>> >>>> --
>> >>>> You received this message because you are subscribed to the Google
>> Groups
>> >>>> "Beyond the PDF" group.
>> >>>> To unsubscribe from this group and stop receiving emails from it,
>> send an
>> >>>> email to beyond-the-pdf+unsubscr...@googlegroups.com.
>> >>>> For more options, visit https

Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord

rebholz/ebi  writes:
> Apart from the scientific work around a journal (gathering papers,
> distributing to reviewers, reviewing, decision making) there is other work:
> proof reading, layout issues and also marketing the journal.  Usually,
> acadmics are not so fond of this part of the work.


If the journal disappears then it doesn't need marketing. Scientists are
more than fond of marketing their own work. Please see here 

 http://www.russet.org.uk/blog/2054

where I don't discuss this issue, but wanted to tell you about anyway.

Phil



Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord
Christian Chiarcos  writes:
> (4) One should probably ask someone from publication business for
> confirmation, but in my understanding, arxive.org serves as a *pre*print
> server, and if your contract gives you (or your contributors) the right to
> make private copies available online, there is no legal way from preventing
> you from publishing your draft papers there.

arxiv.org is a "preprint" server because academics were worried that
this would count as prior publication, and so prevent them from
getting a publication in a tree-ware journal, thus losing them academic
brownie points.

The contracts with publishers are often quite specific about what you
are allowed to do with their content (that you wrote), and talk about
"your personal website". This may, or may not, include arxiv.


> (8) A better solution would be a free, community-maintained portal where
> researchers are allowed to publish for a minimal fee (or no fee at all).
> But there is no such thing as a free lunch, and long-term sustainability of
> this platform for the next, say, 100 years, needs to be secured *somehow*.
> So, it represents a considerable financial load. 


It's called arxiv, and it represents a load of $7 per paper. Publishers
do not offer long-term sustainability; it is the libraries that offer
this.

> Just my two (well, eight) cents ;) To sum it up: At the moment, the
> double-publication strategy of free drafts online plus commercial final
> publication (resp., open-access proceedings and commercial postproceedings)
> seems to offer the best of both worlds, and depending on your publisher and
> your contract, it should be possible to do so in a legally proper way
> already at the moment.

I'm deeply confused. What is the "best" we are getting from the
commercial postproceedings? And how does having two copies of every
paper around help?

Phil



Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord


Dump Springer, and just publish the results on arXiv. If ESWC cannot
organise a conference at 800 Euro a pop, without cash from Springer,
then perhaps they should try getting a cheaper venue.

Better still, let's separate out the committees, the publication, and
the conference. The committees can look at papers, they can all be
published on arxiv. And people who want can go to the conference.

Phil


Alexander Garcia Castro  writes:
> the question is simple. both, eswc and iswc are prominent conferences
> because of a serious review process, a well structured set of
> committees working hard at the time of organization... but most of
> all, because we the community have accepted both conferences to be
> important. this will not change. so my point is, are publishers
> contributing with money,  serious money, to the organization of the
> conferences? how are they buying and how are we selling the
> publication rights? If tomorrow there were no springer how much would
> that affect the finances of eswc and iswc? substantially? why not
> enforcing an open publication policy for iswc and eswc? why not
> selling publication rights as a bidding process?
>
> On Tue, May 14, 2013 at 11:16 AM, Rowe, Matthew  
> wrote:
>> As authors of accepted papers, don't we have the right to disseminate our
>> work as a pre-prints anyway? I just put mine online anyway, and always have
>> done (and will do) for people to download and read.
>>
>> Matthew
>>
>> On 14 May 2013, at 10:12, Phillip Lord wrote:
>>
>>>
>>> ISWC and ESWC are a particular problem because they are both Springer. I
>>> pulled my paper from publication last year, as they would not do an open
>>> access option.
>>>
>>> So, with the situation as it stands, I cannot publish any semantic web
>>> research in either of these two conferences.
>>>
>>> Phil
>>>
>>> Alexander Garcia Castro  writes:
>>>
>>>> conferences are important on their own. for instance, right now the
>>>> ISWC is an important conference regardless of the publisher of the
>>>> proceedings. if I wanted to get the 2012 proceedings I may have to pay
>>>> (http://www.springer.com/computer/ai/book/978-3-642-35172-3). do
>>>> publishers pay the ISWC organizers for the right to publish the
>>>> proceedings? I mean, as things are now the ISWC brings people to
>>>> springer, not the other way around.
>>>>
>>>> On Mon, May 13, 2013 at 5:34 PM, Leon Derczynski  
>>>> wrote:
>>>>> Reliable dissemination.
>>>>>
>>>>> CEUR-WS, ACL Anthology et al. do a valuable, critical job.
>>>>>
>>>>>
>>>>> On 13 May 2013 17:25, Sarven Capadisli  wrote:
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> If we subscribe to science, free and open access to knowledge, what's the
>>>>>> purpose of the arrangement between conferences and publishers?
>>>>>>
>>>>>> -Sarven
>>>>>> http://csarven.ca/#i
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Leon R A Derczynski
>>>>> Research Associate, NLP Group
>>>>>
>>>>> Department of Computer Science
>>>>> University of Sheffield
>>>>> Regent Court, 211 Portobello
>>>>> Sheffield S1 4DP, UK
>>>>>
>>>>> +45 5157 4948
>>>>> http://www.dcs.shef.ac.uk/~leon/
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups
>>>>> "Beyond the PDF" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>>> email to beyond-the-pdf+unsubscr...@googlegroups.com.
>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Alexander Garcia
>>>> http://www.alexandergarcia.name/
>>>> http://www.usefilm.com/photographer/75943.html
>>>> http://www.linkedin.com/in/alexgarciac
>>>
>>> --
>>> Phillip Lord,   Phone: +44 (0) 191 222 7827
>>> Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
>>> School of Computing Science,
>>> http://homepages.cs.ncl.ac.uk/phillip.lord
>>> Room 914 Claremont Tower,   skype: russet_apples
>>> Newcastle University,   twitter: phillord
>>> NE1 7RU
>>>
>>
>>
>>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord


Perhaps, although if you have given away your copyright, they could
remove this from you at any time, with no justification. And I don't
have the right to download your work, analyse it, and generate aggregate
data sets on the basis of this.

I published my paper in Future Internet. In the future, any papers that
get accepted to ESWC and ISWC will just go onto arXiv -- this is
assuming you can submit *without* agreeing to publish.

Phil

"Rowe, Matthew"  writes:

> As authors of accepted papers, don't we have the right to disseminate our work
> as a pre-prints anyway? I just put mine online anyway, and always have done
> (and will do) for people to download and read.
>
> Matthew
>
> On 14 May 2013, at 10:12, Phillip Lord wrote:
>
>> 
>> ISWC and ESWC are a particular problem because they are both Springer. I
>> pulled my paper from publication last year, as they would not do an open
>> access option.
>> 
>> So, with the situation as it stands, I cannot publish any semantic web
>> research in either of these two conferences.
>> 
>> Phil
>> 
>> Alexander Garcia Castro  writes:
>> 
>>> conferences are important on their own. for instance, right now the
>>> ISWC is an important conference regardless of the publisher of the
>>> proceedings. if I wanted to get the 2012 proceedings I may have to pay
>>> (http://www.springer.com/computer/ai/book/978-3-642-35172-3). do
>>> publishers pay the ISWC organizers for the right to publish the
>>> proceedings? I mean, as things are now the ISWC brings people to
>>> springer, not the other way around.
>>> 
>>> On Mon, May 13, 2013 at 5:34 PM, Leon Derczynski  
>>> wrote:
>>>> Reliable dissemination.
>>>> 
>>>> CEUR-WS, ACL Anthology et al. do a valuable, critical job.
>>>> 
>>>> 
>>>> On 13 May 2013 17:25, Sarven Capadisli  wrote:
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> If we subscribe to science, free and open access to knowledge, what's the
>>>>> purpose of the arrangement between conferences and publishers?
>>>>> 
>>>>> -Sarven
>>>>> http://csarven.ca/#i
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Leon R A Derczynski
>>>> Research Associate, NLP Group
>>>> 
>>>> Department of Computer Science
>>>> University of Sheffield
>>>> Regent Court, 211 Portobello
>>>> Sheffield S1 4DP, UK
>>>> 
>>>> +45 5157 4948
>>>> http://www.dcs.shef.ac.uk/~leon/
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Alexander Garcia
>>> http://www.alexandergarcia.name/
>>> http://www.usefilm.com/photographer/75943.html
>>> http://www.linkedin.com/in/alexgarciac
>> 
>> -- 
>> Phillip Lord,   Phone: +44 (0) 191 222 7827
>> Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
>> School of Computing Science,
>> http://homepages.cs.ncl.ac.uk/phillip.lord
>> Room 914 Claremont Tower,   skype: russet_apples
>> Newcastle University,   twitter: phillord
>> NE1 7RU 
>> 

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord

ISWC and ESWC are a particular problem because they are both Springer. I
pulled my paper from publication last year, as they would not do an open
access option.

So, with the situation as it stands, I cannot publish any semantic web
research in either of these two conferences.

Phil

Alexander Garcia Castro  writes:

> conferences are important on their own. for instance, right now the
> ISWC is an important conference regardless of the publisher of the
> proceedings. if I wanted to get the 2012 proceedings I may have to pay
> (http://www.springer.com/computer/ai/book/978-3-642-35172-3). do
> publishers pay the ISWC organizers for the right to publish the
> proceedings? I mean, as things are now the ISWC brings people to
> springer, not the other way around.
>
> On Mon, May 13, 2013 at 5:34 PM, Leon Derczynski  wrote:
>> Reliable dissemination.
>>
>> CEUR-WS, ACL Anthology et al. do a valuable, critical job.
>>
>>
>> On 13 May 2013 17:25, Sarven Capadisli  wrote:
>>>
>>> Hi!
>>>
>>> If we subscribe to science, free and open access to knowledge, what's the
>>> purpose of the arrangement between conferences and publishers?
>>>
>>> -Sarven
>>> http://csarven.ca/#i
>>>
>>>
>>
>>
>>
>> --
>> Leon R A Derczynski
>> Research Associate, NLP Group
>>
>> Department of Computer Science
>> University of Sheffield
>> Regent Court, 211 Portobello
>> Sheffield S1 4DP, UK
>>
>> +45 5157 4948
>> http://www.dcs.shef.ac.uk/~leon/
>>
>>
>>
>
>
>
> -- 
> Alexander Garcia
> http://www.alexandergarcia.name/
> http://www.usefilm.com/photographer/75943.html
> http://www.linkedin.com/in/alexgarciac

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Is science on sale this week?

2013-05-14 Thread Phillip Lord

It's a badge of honour. It shows that you are doing something worthy,
and makes it more likely that someone will pay for you to go to the
conference.

Scientific publication ceased to be about communication years ago.

Phil

Sarven Capadisli  writes:

> Hi!
>
> If we subscribe to science, free and open access to knowledge, what's the
> purpose of the arrangement between conferences and publishers?
>
> -Sarven
> http://csarven.ca/#i
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Final CFP: In-Use Track ISWC 2013

2013-05-10 Thread Phillip Lord
Norman Gray  writes:
>> Norman Gray  writes:
 I am not completely familiar with DOI. Am I right, that it more or less
 provides the same service as http://purl.org .
 DOI links on the resource-level. You would still need frag ids to link to 
 parts.
 Firefox can actually handle this:
 http://dx.doi.org/10.1038%2Fscientificamerican1210-80#atl
>>> 
>>> It's not the same thing as purl.org.
>> 
>> The mechanism by which DOIs and purls are resolved is more or less
>> identical. Under the hood, DOIs use handles, purl.org uses a triple
>> store. In practice, users don't interact with either directly.
>
> Well, yes and no. The distinction I was thinking about was that PURLs are
> _defined_ in terms of an HTTP redirect (the triple store behind it is an
> implementation detail), whereas DOIs are defined in terms of the underlying,
> distributed, Handle system. There, the dx.doi.org URL is 'just' a convenience
> layer on top of the 'real' API.

They are, indeed, although as far as I can see, in general, the only
people who interact with the handle system are people like CrossRef and
DataCite. So, while I accept this point, I think it doesn't make any
difference.


> I don't think this is just a quibble, because this, plus the different
> sustainability model, effectively gives the DOIs different persistence
> properties from PURLs. Whether those different properties are _practically_
> different is of course a different question. Myself, I'm broadly doubtful that
> there's a massive practical difference; but although I'm unpersuaded by it, I
> can see the force of the argument that the DOI sustainability model is of
> crucial importance.

I think, here, you need to separate the organisational details from the
technical ones. DOIs are all run by the DOI Foundation, but there are 8
different registration authorities, and they have different models.
Nonetheless, to a degree, a DOI comes with a built-in social contract.

PURLs, on the other hand, do not. So, the PURLz server at www.purl.org,
the one at purl.bioontology.org and the one that I run on my local
machine so I can play with it, all have very different contracts.

From the social point of view, comparing DOI with PURL doesn't make
sense. You need to compare DOIs from mEDRA with the PURL server at
purl.bioontology.org. 


> The other argument for DOIs is that 'http:' refers to a transport protocol,
> which is being hijacked as an identifier scheme, and will presumably be
> replaced by whatever replaces HTTP over the coming decades. I think this
> argument, also, is initially attractive but unpersuasive in detail, but it
> doesn't even arise for 'doi:', which is an identifier scheme by definition.

I also find this unpersuasive. First, the standard guidelines for
displaying (CrossRef) DOIs now say "http://dx.doi.org/10."; so even if
this is widely ignored, any change of http:// to pantp:// (Phil's
all-new-transport-protocol) would affect DOIs too. Second, if http: ever
changes and becomes less popular, it will happen slowly and have so many
effects that a general solution would be found.




 If I am right, DOI also wouldn't be able to provide links to the 40
 million mentions contained in the Wiki links corpus:
 http://techcrunch.com/2013/03/08/google-research-releases-wikilinks-corpus-with-40m-mentions-and-3m-entities/
 That's 40 million DOIs 
>>> 
>>> I don't there would be such DOIs, unless someone has spent quite a lot of 
>>> money registering them.
>> 
>> 
>> A purl would be much better in this case anyway, since purls support
>> partial redirection, which to my knowledge, DOIs do not. With DOIs you
>> would need 40 million DOIs. With purls, you would create a single
>> partial redirect purl and handle the rest locally.
>
> I've been on the fringes of Datacite discussions, so don't know the fully
> up-to-date details, but I believe that one of the use-cases, in discussions
> about the pricing structure, is the case where someone _does_ want to register
> millions of DOIs per year (or billions: what about a DOI for every LHC
> event?). I _think_ the resolution to the 40M DOIs question is "don't do that,
> then", but the question has crossed the Datacite people's minds, and the
> different Datacite registries have (I understand) different pricing models for
> different DOI volumes.

Or an effectively infinite number of DOIs -- you can do this with PURLs,
but not with DOIs. At this scale, DOIs do not work, because they are
preregistered. PURLs do just fine, since with a partial redirect and a
deterministic algorithm, you can create them lazily.
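As an illustration of that combination — everything here (domains, paths, the resolver function itself) is hypothetical, and a real PURL server implements the rewrite server-side as an HTTP redirect:

```python
# Hypothetical sketch of a partial-redirect PURL: one registered prefix
# covers an unbounded identifier space, so individual PURLs are never
# pre-registered -- they are rewritten deterministically on demand.
PARTIAL = "http://purl.example.org/corpus/"   # the single registered redirect
TARGET = "http://data.example.org/mentions/"  # local server handles the rest

def resolve(purl):
    """Rewrite any PURL under the partial redirect to its local target."""
    if not purl.startswith(PARTIAL):
        raise ValueError("not under the partial redirect")
    return TARGET + purl[len(PARTIAL):]

print(resolve("http://purl.example.org/corpus/mention/42"))
# -> http://data.example.org/mentions/mention/42
```

The point being that nothing per-identifier is ever stored: the 40 million (or 40 billion) identifiers exist only as a rule.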

Phil



Re: Final CFP: In-Use Track ISWC 2013

2013-05-09 Thread Phillip Lord
Norman Gray  writes:
>> I am not completely familiar with DOI. Am I right, that it more or less
>> provides the same service as http://purl.org .
>> DOI links on the resource-level. You would still need frag ids to link to 
>> parts.
>> Firefox can actually handle this:
>> http://dx.doi.org/10.1038%2Fscientificamerican1210-80#atl
>
> It's not the same thing as purl.org.

The mechanism by which DOIs and purls are resolved is more or less
identical. Under the hood, DOIs use handles, purl.org uses a triple
store. In practice, users don't interact with either directly.

> A DOI (parsed as "digital (object identifier)") is an opaque ID for an object
> of some time, which you look up in a distributed registry of resources. Thus
> your example of doi:10.1038/scientificamerican1210-80 is a name for that
> article. DOIs can also be looked up using the dx.doi.org service, but that's
> just a convenience interface to the underlying API, which is based on the
> broader-remit Handle system. Since there are no fragment IDs defined in the
> doi: URI scheme (as far as I recall), there's no meaning can be attached to
> the fragment in the dx.doi.org HTTP URI.
>
> It's also -- I _think_ -- not specified what precisely it is that the DOI 
> denotes.


This is the same as purls, after the DNS part of the system. Of course,
anyone can set up a new purl server, and the domain name of this depends
on DNS. Strictly, this is not true of DOIs, although it is true of DOI
URIs (http://dx.doi.org/10.)

> The other big difference is that DOIs cost actual money, of the order of
> $1/DOI, though there's lots of variation. This is the sustainability model for
> DOIs: if one registry disappears, others can take over.
>
> The most common objects which are given DOIs are journal articles, of course,
> but there's currently a lot of effort going into the detailed mechanics of how
> you acquire a DOI for a dataset, what precisely that means, and what the cost
> model should be for registering DOIs in this context and in these numbers. See
> 


This really depends on the registration agency, of which there are 8.
CrossRef DOIs for subparts can cost as little as $0.06. DataCite DOIs
come with a different set of guarantees to CrossRef's, as far as I can
see. So, CrossRef provides a guarantee of one DOI to one object, which
DataCite doesn't. I *think* DataCite says "what is resolved doesn't
change", while CrossRef only says "it should maintain its logical
identity". 

All DOIs provide metadata, although only at the Handle level. DataCite
and CrossRef DOIs also do content negotiation over HTTP; unfortunately,
at the HTTP level it is not possible to distinguish the different DOIs
from each other. The metadata you get back is not entirely standardized,
even between CrossRef and DataCite.
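Concretely, the negotiation is just an Accept header sent to the resolver. A sketch that builds (but does not send) such a request — text/turtle is one of the MIME types the CrossRef/DataCite resolvers have advertised, though the exact list may have changed:

```python
import urllib.request

# Sketch: DOI content negotiation is an HTTP Accept header sent to the
# resolver, which then redirects to metadata rather than a landing page.
def doi_metadata_request(doi, mime="text/turtle"):
    """Build a request asking the DOI resolver for RDF metadata."""
    return urllib.request.Request(
        "http://dx.doi.org/" + doi,
        headers={"Accept": mime},  # negotiation happens on this header
    )

req = doi_metadata_request("10.1038/scientificamerican1210-80")
# urllib.request.urlopen(req) would follow the redirects and, network
# permitting, return Turtle for a CrossRef/DataCite-registered DOI.
```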


>
> That's the edited highlights: more details at 
> 
>
>> If I am right, DOI also wouldn't be able to provide links to the 40
>> million mentions contained in the Wiki links corpus:
>> http://techcrunch.com/2013/03/08/google-research-releases-wikilinks-corpus-with-40m-mentions-and-3m-entities/
>> That's 40 million DOIs 
>
> I don't there would be such DOIs, unless someone has spent quite a lot of 
> money registering them.


A purl would be much better in this case anyway, since purls support
partial redirection, which to my knowledge, DOIs do not. With DOIs you
would need 40 million DOIs. With purls, you would create a single
partial redirect purl and handle the rest locally.

Phil



Re: Petitioning ISWC to allow Web friendly formats

2013-05-09 Thread Phillip Lord
Steve Pettifer  writes:
> This is a tempting assumption to make, especially if you come from computer
> science / maths / physics and related disciplines (as I do). But my experience
> in the life sciences is that authors do 'paint' their manuscripts by hand,
> painstakingly selecting the font and format for every bit of their document.
> Even using the 'semantic' features of wordprocessors (such as 'Heading 1') is
> something that's not commonplace.

And they never worked anyway, since two pieces of "Heading 1" text could
look completely different. I tried to use the features, but they were
useless for collaborative work as collaborators never used them. 


> So before we get too carried away with expecting people to write HTML
> / LaTex or even markup, we'll need to take into account the working
> practises of the vast majority of academics outside of the more
> 'semantically aware' bits of science.


I think the current idea is not to expect people to write in HTML, but
to stop preventing them. 

Phil



Re: Final CFP: In-Use Track ISWC 2013

2013-05-02 Thread Phillip Lord

Sebastian Hellmann  writes:
> personally for me latex works best, because it has good editors and support
> for description logic formulas. 

Haven't tried it with DL yet, but it's worth looking at MathJax, which
allows you to drop tex mathmode directly into HTML and have it all work
nicely. 
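A minimal sketch of what that looks like, with a DL-flavoured formula inline (the CDN URL and config name are illustrative and may well have changed):

```html
<!-- Sketch: TeX math mode dropped straight into HTML; MathJax renders
     it client-side. CDN URL and config are illustrative. -->
<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML"></script>
<p>A DL axiom such as
\( \mathrm{Person} \sqsubseteq \exists \mathrm{hasParent}.\mathrm{Person} \)
renders in the browser, with no PDF in sight.</p>
```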

I agree with you about the editors; I'm an Emacs head, and even there,
markdown or asciidoc modes do not stand up to AUCTeX.

> Plus it is widely used and quite good for PDF typesetting.

And sucks on the web, which is a shame. If I could get good HTML out of
it, I would be a happy man.


Phil






Re: Publication of scientific research

2013-04-30 Thread Phillip Lord
Sarven Capadisli  writes:
> I made a simple proposal [1]. Zero direct responses. Why?
>

Your proposal didn't meet the standard 12 pages in LNCS, so everyone
discarded it as having no weight. 

I like your proposal. I think it is very good. I especially like the
idea of submitting a URL. The URLs could be harvested and stored at
archive.org, which would help to address the digital preservation issue.

Phil



Re: Publication of scientific research

2013-04-29 Thread Phillip Lord
Daniel Schwabe  writes:
> On Apr 27, 2013, at 12:25  - 27/04/13, Phillip Lord 
>  wrote:
>> How about not caring how long the article is? I mean why not have guidelines
>> like "papers should be about 6 pages/3000 words, but do what you want. If
>> it's long-winded, expect to get screwed at review".
>> 
>> I just spent an hour this week removing "last-accessed: 01-01-2012" data
>> from my reference list, so I could get inside a page limit. Why? The data is
>> (slightly) valuable, so why remove it? Page limit is a tree-ware hangover. I
>> blame the conference organisers.
>
> Surprisingly, this is actually not true. In a couple of conferences where I
> was the PC Chair (including ISWC), I suggested precisely this, and the
> OVERWHELMING reaction was that people preferred page limits (and not too big
> either). The rationale was that it would then become "unfair" because some
> people would submit shorter papers than others, and possibly be at a
> disadvantage. It would eventually lead to a "race" with people submitting
> longer and longer papers, in the hopes of maximizing their chances of
> acceptance, and put an impossible burden on reviewers.

This is a clear worry; although, and let's all be honest here, how many
times have you padded a paper because, even though you had said all you
needed to, it wasn't the correct length?

The issue of longer and longer papers gets addressed simply when
reviewers realise that "TL;DR" is, in fact, a perfectly valid review. Of
course, this is not to say that long papers would be ruled out; if they
were clearly written, and needed the space then reviewers wouldn't mind.
I mean no one complains that Harry Potter goes on for too long -- well,
okay, maybe number 5 did drag a bit.


>> As for CEUR, well, try asking them. I mean, an archive site, on the web,
>> can't cope with HTML? If that fails, let's put our papers on archive.org
>> which will archive this sort of stuff.
>> Or arxiv which will take HTML.
>> 
> I have nothing against being able to read a paper as HTML. However, to be
> frank, the vast majority of papers is written linearly, and do not exploit the
> hypertext nature of the Web. Of course, there are many advantages in being
> able to add meta-data, but to me this is a separate issue.


I don't really care about hypertext, per se. HTML is a good format
because the viewers are much more widely used and the formats are just
better, because they are mass media, commodity tools. Reading two-column
PDFs on small screens is difficult, I think. I still print out most of
my papers, as you do. But I only do this with academic papers;
everything else I read on the web. 

The ability to start to add additional structured data to the papers is
also a good justification, although you can do this with PDF, even if
nobody does.



> The most useful feature I've found when reading on a tablet is being
> able to zoom in and out, and eventually being able to follow a link.

Like on the web.

Phil



Re: Publication of scientific research

2013-04-29 Thread Phillip Lord
Andrea Splendiani  writes:
> I'm involved in the organization of a couple of conferences and workshops.
> You do need a template, as without this it's hard to have homogenous
> submissions (even for simple things as page, or html equivalent,
> length).

It's worth thinking about homogeneity. I would posit that the article is
more important than the journal or conference in this day and age.
Consider these papers:


Histone Variant H2A.Z Regulates Centromere Silencing and
Chromosome Segregation in Fission Yeast

http://www.jbc.org/content/285/3/1909.full

Centromere Silencing and Function in Fission Yeast Is Governed by the
Amino Terminus of Histone H3

http://www.sciencedirect.com/science/article/pii/S0960982203007000

WDHD1 modulates the post-transcriptional step
of the centromeric silencing pathway

http://nar.oxfordjournals.org/content/early/2011/01/25/nar.gkq1338.full

All about roughly the same thing. All with different formats and
representations. Does this really hinder my understanding of them? Or
prevent me from judging one against the other?

Phil



RE: Publication of scientific research

2013-04-27 Thread Phillip Lord
How about not caring how long the article is?  I mean why not have guidelines 
like "papers should be about 6 pages/3000 words, but do what you want. If it's 
long-winded, expect to get screwed at review".

I just spent an hour this week removing "last-accessed: 01-01-2012" data from 
my reference list, so I could get inside a page limit. Why? The data is 
(slightly) valuable, so why remove it? Page limit is a tree-ware hangover. I 
blame the conference organisers.

As for CEUR, well, try asking them. I mean, an archive site, on the web, can't 
cope with HTML? If that fails, let's put our papers on archive.org which will 
archive this sort of stuff.
Or arxiv which will take HTML.





From: Andrea Splendiani [andrea.splendi...@iscb.org]
Sent: 27 April 2013 15:41
To: Phillip Lord
Cc: Bo Ferri; public-lod@w3.org
Subject: Re: Publication of scientific research

Hi,

we could switch from asking for a given page number to some word-count range 
(perhaps with an estimate for words and figures). With a bit of template + css, 
even publication should be ok.
The main issue I can see in stopping asking for pdf is that, if you then want 
proceedings in, say, ceur, you need pdfs. Although they are not strict about 
the format so we could go there from html.
Other than this, we could indeed already start asking for different formats.

best,
Andrea

Il giorno 27/apr/2013, alle ore 13:56, Phillip Lord 
 ha scritto:

> My colleague, Allyson Lister managed to post here entire thesis onto a 
> commodity wordpress.
>
> http://themindwobbles.wordpress.com/2013/01/02/phd-thesis-table-of-contents/
>
> So, it is possible, even if it is rather unwieldy at the moment. In this 
> case, she used latextowordpress
> which in turn uses plastex. Tex always presents the problem that the only 
> thing that can sanely
> turn tex into output is tex. If you need proof of this, look up David 
> Carlisles xii.tex.
>
> I'm hoping that luatex is finally going to crack this open, because it really 
> needs to go there.
>
> The key point is the Ally tried, and from this we learnt. My own feeling is 
> that a very simple step
> would be for conferences to stop requiring PDF. I'm happy with a number of 
> tool chains for writing
> content; currently, though, when faced with "use exactly this guidelines with 
> exactly this number
> of page" directives, there is not alternative but tex (for me).
>
> 
> From: Bo Ferri [z...@smiy.org]
> Sent: 27 April 2013 13:33
> To: public-lod@w3.org
> Subject: Re: Publication of scientific research
>
> Hi all,
>
> generally, it is really interesting to follow this discussion. I've
> never been a friend of the cumbersome scientific research publishing way
> (that's why there are no real scientific publications available from
> myself - except the two theses that I wrote during my studies ;) ). So
> I'm really looking forward to a more webby for doing this since ages.
> When I wrote my final thesis in 2010/11 - I tried to publish as many
> content (section excerpts, slides, ontologies, code, ... e.g. [3, 4, 5])
> on the web from my theses as I could. For a starting point Wordpress and
> Drupal are your friends (as Ivan already suggested). With semantic
> annotation plugins á la RDFaCE [1] one could go even a step further.
> Other plugins such as Angelo's Wordpress extension [2] can deliver the
> metadata.
> However, finally I wrote the thesis itself in LaTeX and published (also
> on the Web, e.g., Mendeley or ResearchGate) as PDF, because I never
> found a good LaTeX to HTML converter that could handle my latex document
> in a satisfiable way. Fortunately, Authorea is exactly trying to do this
> for me. So I'll give it try with my theses (with so many web references
> inside ;) ) and let you know whether it is really able to handle a "real
> world" LaTeX document.
> At the end tools such as Authorea are the perfect bridge for the way we
> are trying to go now by bringing scientific research publishing to the
> present.
>
> Cheers,
>
>
> Bo
>
>
> [1] http://rdface.aksw.org/
> [2] https://github.com/angelo-v/wp-linked-data
> [3] http://smiy.org
> [4] http://purl.org/smiy/
> [5] http://zazi.smiy.org/slides/pmkb-defence/pmkb.html
>
> On 4/25/2013 8:21 AM, Herbert Van de Sompel wrote:
>>
>> There is evolution in this realm too, see e.g. https://www.authorea.com/
>>
>> Greetings
>>
>> Herbert
>
>
>




RE: Publication of scientific research

2013-04-27 Thread Phillip Lord
My colleague Allyson Lister managed to post her entire thesis onto a 
commodity WordPress.

http://themindwobbles.wordpress.com/2013/01/02/phd-thesis-table-of-contents/

So, it is possible, even if it is rather unwieldy at the moment. In this case, 
she used latextowordpress
which in turn uses plastex. Tex always presents the problem that the only thing 
that can sanely 
turn tex into output is tex. If you need proof of this, look up David Carlisle's 
xii.tex.

I'm hoping that luatex is finally going to crack this open, because it really 
needs to go there. 

The key point is that Ally tried, and from this we learnt. My own feeling is 
that a very simple step
would be for conferences to stop requiring PDF. I'm happy with a number of tool 
chains for writing 
content; currently, though, when faced with "use exactly these guidelines with 
exactly this number
of pages" directives, there is no alternative but tex (for me). 


From: Bo Ferri [z...@smiy.org]
Sent: 27 April 2013 13:33
To: public-lod@w3.org
Subject: Re: Publication of scientific research

Hi all,

generally, it is really interesting to follow this discussion. I've
never been a friend of the cumbersome scientific research publishing way
(that's why there are no real scientific publications available from
myself - except the two theses that I wrote during my studies ;) ). So
I'm really looking forward to a more webby for doing this since ages.
When I wrote my final thesis in 2010/11 - I tried to publish as many
content (section excerpts, slides, ontologies, code, ... e.g. [3, 4, 5])
on the web from my theses as I could. For a starting point Wordpress and
Drupal are your friends (as Ivan already suggested). With semantic
annotation plugins á la RDFaCE [1] one could go even a step further.
Other plugins such as Angelo's Wordpress extension [2] can deliver the
metadata.
However, finally I wrote the thesis itself in LaTeX and published (also
on the Web, e.g., Mendeley or ResearchGate) as PDF, because I never
found a good LaTeX to HTML converter that could handle my latex document
in a satisfiable way. Fortunately, Authorea is exactly trying to do this
for me. So I'll give it try with my theses (with so many web references
inside ;) ) and let you know whether it is really able to handle a "real
world" LaTeX document.
At the end tools such as Authorea are the perfect bridge for the way we
are trying to go now by bringing scientific research publishing to the
present.

Cheers,


Bo


[1] http://rdface.aksw.org/
[2] https://github.com/angelo-v/wp-linked-data
[3] http://smiy.org
[4] http://purl.org/smiy/
[5] http://zazi.smiy.org/slides/pmkb-defence/pmkb.html

On 4/25/2013 8:21 AM, Herbert Van de Sompel wrote:
>
> There is evolution in this realm too, see e.g. https://www.authorea.com/
>
> Greetings
>
> Herbert





Re: Publication of scientific research

2013-04-26 Thread Phillip Lord
c language to "write data".
>>>>> But many people are not even used to a computational language at all...
>>>>> the typical interface for "data" typically being an excel spreadsheet.
>>>> 
>>>> Yes, and a spreadsheet too is an awesome tool for the "data scribbling"
>>>> patterns I am referring to. No disagreement there since, that used to be
>>>> my initial alternative to Turtle approach i.e., express RDF triples using
>>>> a spreadsheet via 3 columns by N rows.
>>>>> At the end, it's in a good part a question of tools that meet users
>>>>> typical practices.
>>>>> 
>>>>> The other good part is actually a question of incentives.
>>>>> Now we can open an historical digression on how in life sciences some
>>>>> publishers have been functional to use of public repositories for data.
>>>>> The same mechanism could work for embedding metadata (if there is a need
>>>>> or incentive, tools come).
>>>> 
>>>> Yes, discoverability via the metadata graphs the emerge from associating
>>>> out-of-band metadata with a PDF.
>>>>> 
>>>>> Yes another bit, I was just wondering: are we sure that authors embedding
>>>>> metadata in their papers is the best way to go ?
>>>> 
>>>> All they need to do is add metadata references (using Linked Data URIs) to
>>>> the citation sections :-)
>>>> 
>>>>> They surely know most about their data, but may get shorts of standards
>>>>> and even have some bias. It looks like a (modern) role for publishers
>>>>> could be to actually put order in metadata provided by users.
>>>> 
>>>> Everyone needs to participate otherwise the "egg and chicken" conundrum
>>>> stalls everything.
>>>> 
>>>> Kingsley
>>>>> 
>>>>> best,
>>>>> Andrea
>>>>> 
>>>>> 
>>>>> Il giorno 25/apr/2013, alle ore 11:57, Kingsley Idehen
>>>>>  ha scritto:
>>>>> 
>>>>>> On 4/25/13 2:05 AM, Ivan Herman wrote:
>>>>>>> As for the metadata: I think even turtle is too complicated for many
>>>>>>> (sorry Kingsley). I am not talking about the average readers of this
>>>>>>> list; I am talking about authors in other disciplines. But, if we bite
>>>>>>> the bullet and we say that papers are submitted in PDF, we could at
>>>>>>> least require to include the metadata in the PDF file. After all, the
>>>>>>> metadata is included in PDF in XMP format, which is (a slightly ugly
>>>>>>> and restricted version of) RDF/XML. It is ugly, but we have enough
>>>>>>> tools around to turn it into Turtle, or JSON-LD, or whatever.
>>>>>> Believe me, I used to believe that Turtle was too complicated for the
>>>>>> casual user. By that I mean a literate individual (in any natural
>>>>>> language) that would like to use the "scribble" approach to data
>>>>>> creation, integration, and publication.
>>>>>> 
>>>>>> The user profile I have in mind certainly isn't scoped to this or any
>>>>>> list associated with Linked Data or the broader Semantic Web etc..
>>>>>> 
>>>>>> Prefixes and absolute URIs are the two things that create the illusion
>>>>>> of Turtle being complex.
>>>>>> 
>>>>>> I arrived at my conclusions by testing my theory against a whole range
>>>>>> of profiles - kids, teenagers, and adults.
>>>>>> 
>>>>>> Once I dropped prefixes and absolute URIs from the introduction it was
>>>>>> smooth sailing. Remember, underlying all natural languages is a form
>>>>>> of subject-predicate-object or subject-verb-object sentence structure.
>>>>>> Thus, <#this> <#relatesTo> <#that> etc.. becomes easy to understand.
>>>>>> 
>>>>>> Remember the claim I make on this very day:
>>>>>> Turtle is the key to unleashing the full potential of RDF model based
>>>>>> Linked Data that scales to the Web :-)
>>>>>> 
>>>>>> Note, HTML is too complicated [1], and that's why we don't have a fully
>>>>>> 
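The prefix-free, relative-URI Turtle style described in the quoted message can be sketched in a few lines of Python. This is an illustrative toy matcher for exactly that restricted subset (the names "alice" and "bob" are invented); real Turtle, with prefixes, literals, and collections, needs a proper parser.

```python
import re

# Prefix-free Turtle of the kind described above: one
# subject-predicate-object statement per line, relative URIs only.
doc = """
<#alice> <#knows> <#bob> .
<#bob> <#worksAt> <#newcastle> .
"""

# A deliberately minimal matcher for this restricted subset.
triple = re.compile(r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s+\.")
triples = [m.groups() for m in triple.finditer(doc)]

for s, p, o in triples:
    print(s, p, o)
```

The point of the exercise is that, stripped of prefixes and absolute URIs, a triple reads like a three-column row, which is exactly the spreadsheet framing mentioned earlier in the thread.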

Re: Publication of scientific research

2013-04-25 Thread Phillip Lord


It's always been done that way. Over time, I have been getting more and
more annoyed with journal submission processes, when I realise how easy
it could be.

Phil

Andrea Splendiani  writes:
> on the other hand...
> many journals offer an extremely tedious submission and formatting process.
> Scientists are usually ok with it, though it sometimes is close to 
> nonsensical.
> On the other hand, there is a lot of objection to adding some metadata, which is
> only a marginal cost in terms of time (or at least has a much higher value per
> unit of time spent).
>
> I wonder why...
>
> best,
> Andrea
>
> Il giorno 25/apr/2013, alle ore 15:11, Rob Warren  
> ha scritto:
>
>> 
>> On 25-Apr-13, at 10:41 AM, Phillip Lord wrote:
>>> Scientists would rather eat their dogs than give up their favoured
>>> editing environments.
>> 
>> And chew off their own (or their RA's) foot as well.
>> 
>> Most conference submission / reviewing software already asks for the basic
>> meta-data boilerplate to help the reviewing process (authors, title,
>> affiliation, etc...) and this is manually entered before the paper is ready.
>> 
>> Why don't we generate the meta-data directly from this process and not
>> bother with the hand editing of anything? It would not be a stretch to get
>> people to submit their citations file (Bibtex, RIS, etc...) along with the
>> paper at camera ready and script the conversion to something semantic web
>> friendly?
>> 
>> This would neatly create the publications, citation and author graph in a 
>> stroke.
>> 
>>> Solution 2. Make it valuable to the authors.
>> 
>> Outcome 1: Make it valuable to the social bookmarking / citation websites
>> downstream to load directly into their systems and increase the visibility
>> of the publication.
>> 
>> -rhw
>> 
>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Publication of scientific research

2013-04-25 Thread Phillip Lord
Rob Warren  writes:

> On 25-Apr-13, at 10:41 AM, Phillip Lord wrote:
>> Scientists would rather eat their dogs than give up their favoured
>> editing environments.
>
> And chew off their own (or their RA's) foot as well.
>
> Most conference submission / reviewing software already asks for the basic
> meta-data boilerplate to help the reviewing process (authors, title,
> affiliation, etc...) and this is manually entered before the paper is ready.
>
> Why don't we generate the meta-data directly from this process and not bother
> with the hand editing of anything? 

http://www.russet.org.uk/blog/2366

The metadata on the HTML for this article comes directly from the
metadata that I entered into arXiv. 

I still prefer the situation where the metadata comes directly from the
file with the content in it.

> It would not be a stretch to get people to submit their citations file
> (Bibtex, RIS, etc...) along with the paper at camera ready and script
> the conversion to something semantic web friendly?
>
> This would neatly create the publications, citation and author graph in a
> stroke.

Yes, that would be nice.

>
>> Solution 2. Make it valuable to the authors.
>
> Outcome 1: Make it valuable to the social bookmarking / citation websites
> downstream to load directly into their systems and increase the visibility of
> the publication.


Yes. Again, the same page provides a bibtex download (which I use for
citing my own work). 

Phil



Re: Publication of scientific research

2013-04-25 Thread Phillip Lord
Sarven Capadisli  writes:
> I'll ask the community: what is the real lesson from this and how can we
> improve?


Scientists would rather eat their dogs than give up their favoured
editing environments.

Solution. Hack stuff into their favoured editing environments by any
means necessary.

Solution 2. Make it valuable to the authors.

Phil






Re: Publication of scientific research

2013-04-25 Thread Phillip Lord

You might be interested in this:

http://bio-ontologies.knowledgeblog.org/table-of-contents

These are papers from a workshop that I used to organise. The content, as
you can see, is in HTML and includes images and so forth. What is
perhaps less obvious is that the source data in most cases is a Word
doc. All the content, including the images, was posted from Word. We did
have to do a little reformatting (the conference template is really the
most unhelpful that it could be -- my fault, I wrote it). It takes
around 5 - 10 minutes a paper on average (there is quite a wide variance).

And more, the content has some semantic markup. The journal, publication
date, authors, and title are all clearly described in the HTML; you can
retrieve this metadata as RDF also, if you like. This metadata was not
added independently; it was present in the underlying Word doc. 

We added this by simply adding a little markup using shortcodes
([author]Phillip Lord[/author]). Of course this is entirely horrible, to
the point that a reviewer of my last grant called it a "drunk under a
lamppost idea". But it does work without requiring any modification of
word. And it works for wikipedia. Besides, nothing wrong with being
drunk under a lamppost occasionally.

It's also possible to combine the web and PDF. So, for instance, this link:

http://www.russet.org.uk/blog/2366


is my OWLED paper. In this case, the title, author, date come from
arXiv, and the abstract is transcluded from there. In short, it's an
overlay journal (article). The English summary and reviews are
independent, and subsidiary content. In this case, the knowledge comes
from arXiv where it has been added independently. I took this route
because, sad though it is to say, getting a Word doc on the web is much
easier than getting a LaTeX document up.

We even have this working for CEUR-WS, although in this case, we lose
the abstracts; I would describe how we achieved this, but really, you
don't want to know.

The HTML is messy, of course, and dependent on the underlying tool. The
use of short codes is unprincipled and hideous. But it does work. And we
can add as much semantics as authors can be bothered with. Given that
the latter will be the limiting factor, I don't think it's a bad way
forward.

Phil







Hugh Glaser  writes:
> I hate PDF with a passion, by the way, but in the socio thingy of
> being an editor of a proceedings, it can be an enormous pain when
> people submit HTML that has local links to images, etc., even from MS
> Word documents.
>
> Cheers
>
> On 24 Apr 2013, at 18:23, Sarven Capadisli 
>  wrote:
>
>> On 04/24/2013 05:37 PM, Andrea Splendiani wrote:
>>> There two main issues in moving beyond pdf.
>>> 
>>> One, probably minor, is that there are larger constraints. Some
>>> people need their work to be somewhere "understood" by their
>>> organization. This is a bit less relevant for conferences than for
>>> journals, but still an issue.
>>> 
>>> The other is that some bit of a research paper can lend to
>>> formalization. But there is a lot of variability. In some case you
>>> are closer to what web languages can represent. E.g.: a finding in
>>> RDF, some algorithm shown in JavaScript... But what if somebody is
>>> publishing a description of an information system? It may get so
>>> far from a standard way to talk about things that you won't gain much
>>> with a structured representation.
>>> 
>>> pdf + other technologies, when it applies, could be a good idea,
>>> though.
>> 
>> I can't quite make out the core of the issues that you are trying to 
>> describe. So, from what I understand:
>> 
>> We could maybe at least give this HTML thing a try. And, later worry about 
>> semantic alignments?
>> 
>> IMHO, there is no compelling reason to research and try PDF + other
>> technologies, when we have HTML+RDF + other technologies already in place
>> and staring right at us.
>> 
>> -Sarven
>> 
>
>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Content negotiation negotiation

2013-04-24 Thread Phillip Lord


Well, more than useful I think. Content negotiation without it seems a
bit pointless.

As you say, the last doesn't and the first two do; they overlap in the
formats they will return, but each has some unique ones. But I had to
read the documentation to find this out. Bit clunky.

Phil

Leigh Dodds  writes:

> The first two indicate that responses vary based on Accept header as
> both have a Vary: Accept. The third doesn't so doesn't support
> negotiation.
>
> None of the URLs advertise what formats are available. That's not a
> requirement for content-negotiation, although it'd be useful.
>
> Cheers,
>
> L.
>
>
> On Wed, Apr 24, 2013 at 2:17 PM, Phillip Lord
>  wrote:
>>
>> Hmmm.
>>
>> So, taking a look at these three URLs, can you tell me
>> a) which of these support content negotiation, and b) what formats
>> they provide.
>>
>> http://dx.doi.org/10.3390/fi4041004
>> http://dx.doi.org/10.1594/PANGAEA.527932
>> http://dx.doi.org/10.1000/182
>>
>> I tried vapor -- it seems to work by probing with application/rdf+xml.
>> I can't find any of the headers mentioned either, although perhaps I am
>> looking in the wrong place.
>>
>> Phil
>>
>>
>>
>> Hugh Glaser  writes:
>>
>>> Ah of course - thanks Mark, silly me.
>>> So I look at the Link: header for something like
>>> curl -L -i http://dbpedia.org/resource/Luton
>>> Which gives me the information I want.
>>>
>>> Anyone got any offers for how I would use Linked Data to get this into my 
>>> RDF store?
>>>
>>> So then I can do things something like:
>>> SELECT ?type ?source WHERE { <http://dbpedia.org/resource/Luton> ?foo ?file .
>>> ?file ?type ?source . }
>>> (I think).
>>>
>>> I suppose it would need to actually be returned from a URI at the site - I
>>> can't get a header via URI resolution - right?
>>> And I would need an ontology?
>>>
>>> Cheers.
>>>
>>> On 23 Apr 2013, at 19:49, Mark Baker 
>>>  wrote:
>>>
>>>> On Tue, Apr 23, 2013 at 1:42 PM, Hugh Glaser  wrote:
>>>>>
>>>>> On 22 Apr 2013, at 12:18, Phillip Lord  
>>>>> wrote:
>>>>> 
>>>>>> We need to check for content negotiation; I'm not clear, though, how we
>>>>>> are supposed to know what forms of content are available. Is there
>>>>>> anyway we can tell from your website that content negotiation is
>>>>>> possible?
>>>>> Ah, an interesting question.
>>>>> I don't know of any, but maybe someone else does?
>>>>
>>>> Client-side conneg, look for Link rel=alternate headers in response
>>>>
>>>> Server-side conneg, look for "Vary: Content-Type" in response
>>>>
>>>> Mark.
>>>
>>>
>>>
>>
>> --
>> Phillip Lord,   Phone: +44 (0) 191 222 7827
>> Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
>> School of Computing Science,
>> http://homepages.cs.ncl.ac.uk/phillip.lord
>> Room 914 Claremont Tower,   skype: russet_apples
>> Newcastle University,   twitter: phillord
>> NE1 7RU
>>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Content negotiation negotiation

2013-04-24 Thread Phillip Lord

Hmmm. 

So, taking a look at these three URLs, can you tell me
a) which of these support content negotiation, and b) what formats
they provide. 

http://dx.doi.org/10.3390/fi4041004
http://dx.doi.org/10.1594/PANGAEA.527932
http://dx.doi.org/10.1000/182

I tried vapor -- it seems to work by probing with application/rdf+xml.
I can't find any of the headers mentioned either, although perhaps I am
looking in the wrong place.

Phil



Hugh Glaser  writes:

> Ah of course - thanks Mark, silly me.
> So I look at the Link: header for something like
> curl -L -i http://dbpedia.org/resource/Luton
> Which gives me the information I want.
>
> Anyone got any offers for how I would use Linked Data to get this into my RDF 
> store?
>
> So then I can do things something like:
> SELECT ?type ?source WHERE { <http://dbpedia.org/resource/Luton> ?foo ?file .
> ?file ?type ?source . }
> (I think).
>
> I suppose it would need to actually be returned from a URI at the site - I
> can't get a header via URI resolution - right?
> And I would need an ontology?
>
> Cheers.
>
> On 23 Apr 2013, at 19:49, Mark Baker 
>  wrote:
>
>> On Tue, Apr 23, 2013 at 1:42 PM, Hugh Glaser  wrote:
>>> 
>>> On 22 Apr 2013, at 12:18, Phillip Lord  wrote:
>>> 
>>>> We need to check for content negotiation; I'm not clear, though, how we
>>>> are supposed to know what forms of content are available. Is there
>>>> anyway we can tell from your website that content negotiation is
>>>> possible?
>>> Ah, an interesting question.
>>> I don't know of any, but maybe someone else does?
>> 
>> Client-side conneg, look for Link rel=alternate headers in response
>> 
>> Server-side conneg, look for "Vary: Content-Type" in response
>> 
>> Mark.
>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: predatory journals and conferences article in NY Times

2013-04-23 Thread Phillip Lord

It's high time universities stopped judging academics by *where* they
have published rather than *what*. 

We already have a form of rating for journals. It's called impact
factor. It doesn't work, because judging papers by their place of
publication is nonsensical.

Linked data and semantic web technologies provide opportunities, I
think, to handle the metadata associated with scientific publication, to
represent the knowledge in academic publications, and to do so without
the necessity for a centralised authority.

But, then I am a researcher with a mental narrow focus, so what do I
know?

Phil

ProjectParadigm-ICT-Program  writes:
> This is a problem which manifests itself in every discipline and it preys on
> basic human needs for recognition. The current publishing world of academia
> itself is to blame partially.
>
> Because in each field of science scientists and researchers usually have a
> short list of peer-reviewed journals and conferences in their mental narrow
> focus, only librarians typically have a (often not much) better overview of
> available reputable journals and conferences in respective fields.
>
> It is high time for a global registry of scientific publishers and their
> respective journals and a form of rating and grading them.
>
> Linked data and semantic web technologies provide opportunities to create such
> rating and grading systems, and maybe an item for a separate W3C Community
> Group?
>
>  
> Milton Ponson
> GSM: +297 747 8280
> PO Box 1154, Oranjestad
> Aruba, Dutch Caribbean
> Project Paradigm: A structured approach to bringing the tools for sustainable
> development to all stakeholders worldwide by creating ICT tools for NGOs
> worldwide and: providing online access to web sites and repositories of data
> and information for sustainable development
>
> This email and any files transmitted with it are confidential and intended
> solely for the use of the individual or entity to whom they are addressed. If
> you have received this email in error please notify the system manager. This
> message contains confidential information and is intended only for the
> individual named. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail.

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU 



Re: Fwd: Publishing Wordpress contents as linked data

2013-04-22 Thread Phillip Lord


Interesting. I've been meaning to get Wordpress content negotiating for
a long time, so I shall take a look at this to see how you have done it.

You might also be interested in a couple of tools that we have written. 
Kblog-metadata also adds metadata in a variety of formats to wordpress,
with support for per post authors, container titles (and dates although
this is not released yet). 

http://wordpress.org/extend/plugins/kblog-metadata/

It has a couple of nice widgets, which can display citation information
also visibly for the author (and others), which helps to make sure it's
correct. 

We also have http://greycite.knowledgeblog.org. This will return
metadata for any URL in a variety of forms, including RDF if you wish.
So, the metadata for your page is...

http://greycite.knowledgeblog.org/rdf?uri=http%3A%2F%2Fdatenwissen.de%2F2013%2F04%2Fwordpress-bloginhalte-als-linked-data%2F

It's not as extensive as the metadata you provide; we don't return
content or foaf authorship. But we estimate that it can return metadata
of this form for around 100 million URLs; it works but provides less
metadata for considerably more URLs. I think it content-negotiates as
well. Need to check.
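For reference, the request URL above is just the article URL percent-encoded into Greycite's 'uri' query parameter; a minimal sketch, assuming nothing beyond the endpoint already shown:

```python
from urllib.parse import urlencode

# Greycite's RDF endpoint, as used above; the article URL goes in
# percent-encoded as the 'uri' query parameter.
GREYCITE_RDF = "http://greycite.knowledgeblog.org/rdf"

def greycite_rdf_url(article_url):
    return GREYCITE_RDF + "?" + urlencode({"uri": article_url})

print(greycite_rdf_url(
    "http://datenwissen.de/2013/04/wordpress-bloginhalte-als-linked-data/"))
```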

We weren't able to scrape that much from your page, incidentally.

http://greycite.knowledgeblog.org/?uri=http://datenwissen.de/2013/04/wordpress-bloginhalte-als-linked-data/

We need to check for content negotiation; I'm not clear, though, how we
are supposed to know what forms of content are available. Is there
anyway we can tell from your website that content negotiation is
possible?

Phil

Angelo Veltens  writes:
> I coded a small wordpress plugin, that enables linked data publishing of
> blog post & author data.
>
> The plugin is installed on my blog http://datenwissen.de. Feel free to
> request Linked Data via "application/rdf+xml" or "text/turtle"
> Accept-Header.
>
> The data of my latest blog post in the Q&D RDF Browser:
>
> http://graphite.ecs.soton.ac.uk/browser/?uri=http%3A%2F%2Fdatenwissen.de%2F2013%2F04%2Fwordpress-bloginhalte-als-linked-data%23it
>
> Blog authors get a FOAF-Profile that I plan to extend to a fully
> functional WebID:
>
> http://graphite.ecs.soton.ac.uk/browser/?uri=http%3A%2F%2Fdatenwissen.de%2Fauthor%2Fangelo%23me#http://datenwissen.de/author/angelo#me
>
> The plugin is not yet available in the wordpress plugin repo, but on
> github: https://github.com/angelo-v/wp-linked-data
>
> Contributions and feedback are welcome.
>
> Kind regards,
> Angelo Veltens
>
>
>

-- 
Phillip Lord,   Phone: +44 (0) 191 222 7827
Lecturer in Bioinformatics, Email: phillip.l...@newcastle.ac.uk
School of Computing Science,
http://homepages.cs.ncl.ac.uk/phillip.lord
Room 914 Claremont Tower,   skype: russet_apples
Newcastle University,   twitter: phillord
NE1 7RU