A URI represents a (parsed) URI. URI supports RFC
3986 (URI Generic Syntax) and can parse any valid URI. However, libsoup only uses "http" and "https" URIs internally; you can use
SOUP_URI_VALID_FOR_HTTP to test whether a URI is a valid HTTP URI.
scheme will always be set in any URI. It is an interned string and is always all lowercase. (If you parse a URI with a
non-lowercase scheme, it will be converted to lowercase.) The macros SOUP_URI_SCHEME_HTTP and SOUP_URI_SCHEME_HTTPS
provide the interned values for "http" and "https" and can be compared against URI scheme values.
user and password are parsed as defined in the older URI specs (i.e., separated by a colon; RFC 3986 only talks
about a single "userinfo" field). Note that password is not included in the output of
to_string. libsoup does not normally use these fields; authentication is handled via
Session signals.
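The handling of user and password described above can be illustrated with a short sketch. It is written in Python with the standard urllib.parse module purely for illustration; it is not libsoup's API, and the function name to_string_without_password is hypothetical. It splits the userinfo at the colon and rebuilds the URI with the user but without the password. The sketch assumes the host is not an IPv6 literal.

```python
from urllib.parse import urlsplit, urlunsplit


def to_string_without_password(text):
    """Rebuild a URI string, keeping the user but dropping the password.

    Hypothetical helper for illustration only; not part of libsoup.
    """
    parts = urlsplit(text)
    # urlsplit separates "user:password" at the first colon in the userinfo.
    netloc = parts.hostname or ""
    if parts.username:
        netloc = parts.username + "@" + netloc
    if parts.port is not None:
        netloc += ":" + str(parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, passing "http://alice:secret@example.com:8080/x?y=1#z" yields the same URI with "alice@" in place of "alice:secret@".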
host contains the hostname, and port the port specified in the URI. If the URI doesn't contain a hostname,
host will be null, and if it doesn't specify a port, port may be 0. However, for "http" and "https"
URIs, host is guaranteed to be non-null (trying to parse an http URI with no host will return null),
and port will always be non-0 (because libsoup knows the default value to use when it is not specified in the URI).
path is always non-null. For http/https URIs, path will never be an empty string either; if the input URI has
no path, the parsed URI will have a path of "/".
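The guarantees for "http" and "https" URIs (lowercase scheme, non-null host, non-zero port, non-empty path) can be summarized in a small sketch. It uses Python's standard urllib.parse as a stand-in and does not call libsoup; parse_http_uri and DEFAULT_PORTS are hypothetical names, and the default ports 80 and 443 are the standard values for HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Standard default ports for the two schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_http_uri(text):
    """Return the parsed fields of an http/https URI, or None if it is not one.

    Hypothetical helper that mirrors the rules described above; not libsoup's API.
    """
    parts = urlsplit(text)
    scheme = parts.scheme.lower()  # the scheme is always reported in lowercase
    if scheme not in DEFAULT_PORTS:
        return None
    if not parts.hostname:
        return None  # an http URI with no host fails to parse
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    path = parts.path or "/"  # an empty path becomes "/"
    return {"scheme": scheme, "host": parts.hostname, "port": port, "path": path}
```

With this sketch, "HTTP://example.com" parses to scheme "http", port 80, and path "/", while "http:///path" returns None because it has no host.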
query and fragment are optional for all URI types. decode may
be useful for parsing query.
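As an illustration of what decoding a query involves, the sketch below splits a form-encoded query string into key/value pairs and %-decodes them. It uses Python's urllib.parse.parse_qsl rather than libsoup, and decode_query is a hypothetical name; it assumes the query is form-encoded ("key=value" pairs joined by "&").

```python
from urllib.parse import parse_qsl


def decode_query(query):
    """Decode a form-encoded query string into a dict.

    Hypothetical helper for illustration; later duplicates of a key
    overwrite earlier ones.
    """
    # keep_blank_values retains keys such as "flag=" that have an empty value.
    return dict(parse_qsl(query, keep_blank_values=True))
```

For instance, the query "q=hello%20world&lang=en" decodes to the pairs q = "hello world" and lang = "en".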
Note that path, query, and fragment may contain %-encoded characters.
URI calls normalize on
them, but not decode. This is necessary to ensure that
to_string will generate a URI that has exactly the same meaning as the original. (In
theory, URI should leave user, password, and host
partially-encoded as well, but this would be more annoying than useful.)
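The difference between normalizing and fully decoding can be shown with a short sketch. It is written in Python and follows the percent-encoding normalization of RFC 3986 section 6.2.2 (decode escapes of unreserved characters, uppercase the hex digits of the rest); that this matches libsoup's normalize in every detail is an assumption, and the function name normalize_part is hypothetical.

```python
import re
import string
from urllib.parse import unquote

# RFC 3986 "unreserved" characters: escaping them is never necessary.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_part(part):
    """Decode only unnecessary %-escapes and uppercase the remaining ones."""
    def fix(match):
        ch = chr(int(match.group(1), 16))
        if ch in UNRESERVED:
            return ch  # e.g. %7E -> "~": same meaning either way
        return "%" + match.group(1).upper()  # e.g. %2f -> %2F: must stay encoded
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, part)


# Normalizing keeps the escaped slash, so the path still has two segments.
normalized = normalize_part("/a%2fb/%7Euser")
# Fully decoding turns %2f into a real "/" and changes the path's structure.
decoded = unquote("/a%2fb/%7Euser")
```

Here the normalized path keeps "a%2Fb" as a single segment, whereas the fully decoded path splits it into "a" and "b", which is why fully decoded fields could not be turned back into an equivalent URI string.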