I am currently trying to install ws-export
(https://github.com/wikimedia/ws-export) and I’m having trouble with
Composer. Would anyone know anything about this?

> composer install --no-dev

> Your lock file does not contain a compatible set of packages. Please run
composer update.

> composer update

> Your requirements could not be resolved to an installable set of packages.

  Problem 1
    - Root composer.json requires PHP extension ext-dom * but it is missing
      from your system. Install or enable PHP's dom extension.
  Problem 2
    - Root composer.json requires PHP extension ext-intl * but it is
      missing from your system. Install or enable PHP's intl extension.
  Problem 3
    - Root composer.json requires PHP extension ext-sqlite3 * but it is
      missing from your system. Install or enable PHP's sqlite3 extension.
  Problem 4
    - Root composer.json requires PHP extension ext-zip * but it is missing
      from your system. Install or enable PHP's zip extension.
  Problem 5
    - symfony/framework-bundle[v5.4.0, ..., v5.4.12] require ext-xml * ->
      it is missing from your system. Install or enable PHP's xml extension.
    - Root composer.json requires symfony/framework-bundle 5.4.* ->
      satisfiable by symfony/framework-bundle[v5.4.0, ..., v5.4.12].

To enable extensions, verify that they are enabled in your .ini files:
    - /etc/php/7.4/cli/php.ini
    - /etc/php/7.4/cli/conf.d/10-opcache.ini
    - /etc/php/7.4/cli/conf.d/10-pdo.ini
    - /etc/php/7.4/cli/conf.d/20-calendar.ini
    - /etc/php/7.4/cli/conf.d/20-ctype.ini
    - /etc/php/7.4/cli/conf.d/20-exif.ini
    - /etc/php/7.4/cli/conf.d/20-ffi.ini
    - /etc/php/7.4/cli/conf.d/20-fileinfo.ini
    - /etc/php/7.4/cli/conf.d/20-ftp.ini
    - /etc/php/7.4/cli/conf.d/20-gettext.ini
    - /etc/php/7.4/cli/conf.d/20-iconv.ini
    - /etc/php/7.4/cli/conf.d/20-json.ini
    - /etc/php/7.4/cli/conf.d/20-phar.ini
    - /etc/php/7.4/cli/conf.d/20-posix.ini
    - /etc/php/7.4/cli/conf.d/20-readline.ini
    - /etc/php/7.4/cli/conf.d/20-shmop.ini
    - /etc/php/7.4/cli/conf.d/20-sockets.ini
    - /etc/php/7.4/cli/conf.d/20-sysvmsg.ini
    - /etc/php/7.4/cli/conf.d/20-sysvsem.ini
    - /etc/php/7.4/cli/conf.d/20-sysvshm.ini
    - /etc/php/7.4/cli/conf.d/20-tokenizer.ini
You can also run `php --ini` in a terminal to see which files are used by
PHP in CLI mode.
Alternatively, you can run Composer with `--ignore-platform-req=ext-dom
--ignore-platform-req=ext-intl --ignore-platform-req=ext-sqlite3
--ignore-platform-req=ext-zip --ignore-platform-req=ext-xml` to temporarily
ignore these required extensions.



Do I really need to install these five extensions myself? Is that the
solution? Shouldn’t they be installed automatically?
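
To narrow this down, here is a minimal Python sketch of how I could check
which of these extensions the CLI PHP actually loads. It assumes that
`php -m` reports the modules of the same PHP binary that Composer runs
under. The function names and the REQUIRED list are my own, taken from the
error output above; none of this is part of ws-export or Composer.

```python
import subprocess

# Extension names copied from the Composer errors above.
REQUIRED = ["dom", "intl", "sqlite3", "zip", "xml"]


def missing_extensions(required, php_m_output):
    """Return the required extensions that are absent from `php -m` output."""
    loaded = set()
    for line in php_m_output.splitlines():
        name = line.strip()
        # Skip blank lines and section headers such as "[PHP Modules]".
        if name and not name.startswith("["):
            loaded.add(name.lower())
    return [ext for ext in required if ext.lower() not in loaded]


def php_modules_output():
    """Run `php -m` and return its text output (requires php on the PATH)."""
    result = subprocess.run(
        ["php", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `missing_extensions(REQUIRED, php_modules_output())` on the machine
in question should return the extensions that are still missing; an empty
list would mean all five are loaded.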

Thank you,
Julius

On Tue 20. Sep 2022 at 17:41, Julius Hamilton <juliushamilton...@gmail.com>
wrote:

> Thank you very much.
>
> > Did you look at the wikitext of that page?
>
> I did now, I see that the text displayed is not actually present in the
> wikitext / source text. I am seeing these ".djvu include" lines:
>
> <pages index="A simplified grammar of the Swedish language.djvu" include=7 />
>
> What is this? Is it a common format for a Wikisource book?
>
> > prop=extracts works, but I would say it's a poor fit for many (most?)
> wikisource pages.
>
> Why? Because it just pulls out sentences from the wikitext? What is
> different about the functioning of prop=revisions, for example?
>
> > Plaintext as in wikitext or in parsed html converted to plaintext?
>
> Whatever you think is preferable, the point is to have some clean,
> readable text. If the parsed HTML has any awkward formatting issues, I
> might prefer the wikitext, or vice versa. Whichever is easier to work with.
> Technically since wikitext is a markup format it might be easier to pull
> out from specific fields you are seeking? I don't know.
>
> > You could use something like this to fetch every page
>
> Thanks. I tried replacing the title with a different, more normal book and
> it didn't seem to work.
>
>
> https://en.wikisource.org/w/api.php?generator=allpages&action=query&prop=revisions&rvprop=content&rvslots=main&gapprefix=Moby-Dick_(1851)_US_edition
>
>
> I guess it's the same problem, "revisions" also pulls out wikitext but
> Wikisource wikitext pulls in its text from separate files?
>
>
> So would the "parse" action of the API be the tool of choice?
>
>
> >  the WS Export tool can do that
>
>
> Thanks very much, will give that a shot next.
>
>
> Thank you,
>
> Julius
>
>
>
>
>
>
> On Tue, Sep 20, 2022 at 2:14 AM Sam Wilson <s...@samwilson.id.au> wrote:
>
>>
>>
>>
>> How can I get the full plaintext from an entire book on Wikisource with
>>> the API?
>>>
>>
>> Plaintext as in wikitext or in parsed html converted to plaintext?
>>
>>
>>
>> If it's the latter, the WS Export tool can do that:
>> https://ws-export.wmcloud.org/?format=txt
>>
>>
>> _______________________________________________
>> Mediawiki-api mailing list -- mediawiki-api@lists.wikimedia.org
>> To unsubscribe send an email to mediawiki-api-le...@lists.wikimedia.org
>>
>