Welcome to YFrake’s documentation!
Description
YFrake is a fast and flexible stock market, forex and cryptocurrency data scraper and server. It enables developers to build powerful apps without having to worry about the details of session management or maximizing throughput.
YFrake has caching built in to speed up requests even more and to reduce load on the source servers. The cache and other YFrake options are fully customizable through the configuration file.
YFrake can be used as a client to directly return market data to the current program or as a programmatically controllable server to provide market data to other applications.
In addition, all network requests by the client in both sync and async modes are non-blocking, which means that your program can continue executing your code while network requests are in progress.
The best part about YFrake is its built-in Swagger API documentation, which you can use to perform test queries and examine the returned responses straight in your web browser.
YFrake is built upon the widely used aiohttp package and its plugins.
Getting Started
Install the package by executing:
pip install yfrake
Import the public objects with:
from yfrake import client, server, config
The client, server, and config objects are singletons, which have been instantiated internally beforehand to provide the user with lower-case object name identifiers.
NB! The minimum required Python version for YFrake is Python 3.10. From YFrake version 2.0.0 forward, trying to import YFrake in lower Python versions will raise a RuntimeError.
Endpoints
No. | Endpoint | Symbol types
---|---|---
1 | historical_prices | stocks, forex, crypto
2 | quotes_overview | stocks, forex, crypto
3 | quote_type | stocks, forex, crypto
4 | news | stocks, forex, crypto
5 | recommendations | stocks, forex, crypto
6 | validate_symbols | stocks, forex, crypto
7 | price_overview | stocks, forex, crypto
8 | detailed_summary | stocks, forex, crypto
9 | options | stocks only
10 | insights | stocks only
11 | esg_chart | stocks only
12 | shares_outstanding | stocks only
13 | esg_scores | stocks only
14 | purchase_activity | stocks only
15 | earnings | stocks only
16 | calendar_events | stocks only
17 | company_overview | stocks only
18 | sec_filings | stocks only
19 | financials | stocks only
20 | recommendation_trend | stocks only
21 | ratings_history | stocks only
22 | earnings_history | stocks only
23 | earnings_trend | stocks only
24 | key_statistics | stocks only
25 | income_statements | stocks only
26 | cashflow_statements | stocks only
27 | balance_statements | stocks only
28 | institution_ownership | stocks only
29 | fund_ownership | stocks only
30 | major_holders | stocks only
31 | insider_transactions | stocks only
32 | insider_holders | stocks only
33 | market_summary | none
34 | trending_symbols | none
35 | currencies | none
Caching
YFrake includes a fast in-memory TLRU cache for the client and the server objects to speed up consecutive identical requests to the same endpoints over a period of time. The default time-to-live (TTL) values have been found to be optimal through testing.
Caching can be disabled either individually for each endpoint by setting its TTL value to zero, or in groups by enabling the group override setting and leaving the relevant group TTL value at zero.
This cache does not persist over program restarts. If the user desires to use something more permanent, it is suggested to use a library like diskcache.
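For readers unfamiliar with the term, a TLRU (time-aware least-recently-used) cache drops entries both when they expire and when the cache is full. The following is a minimal sketch of that idea, not YFrake's actual implementation: the class name TLRUCache, its parameters and the injectable clock are assumptions made for illustration. A TTL of zero disables caching for an entry, mirroring the behaviour described above.

```python
import time
from collections import OrderedDict


class TLRUCache:
    """Illustrative time-aware LRU cache (hypothetical, not YFrake's code)."""

    def __init__(self, max_entries=64, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expires_at, value)
        self._max_entries = max_entries
        self._clock = clock

    def set(self, key, value, ttl):
        # A TTL of zero (or less) means "do not cache this entry".
        if ttl <= 0:
            return
        self._entries[key] = (self._clock() + ttl, value)
        self._entries.move_to_end(key)
        # Evict the least recently used entries when over capacity.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry has outlived its TTL, so drop it.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return value
```

Passing a fake clock instead of time.monotonic makes the expiry logic easy to exercise without waiting for real time to pass.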
Overview
Client Object
Methods
The client singleton is the main object which is used to request data from the Yahoo Finance API servers.
It has three methods: the get method, which is used to make a single request, the batch_get helper method, which is used to schedule multiple requests with one call, and the get_all helper method, which requests data about a single symbol from all symbol-specific endpoints at once.
Decorators
The client object has a single decorator named session, which opens a session to the Yahoo Finance API servers and inspects the concurrency mode of your program to adjust its behaviour accordingly.
This enables YFrake to work in async and sync (threaded) modes out of the box.
A function or a coroutine must be decorated with this decorator before any calls to the client methods are made.
Calls to the client methods do not have to take place inside the same function or coroutine which was decorated.
For simplicity’s sake, it is recommended to decorate the main function or coroutine of your program, so that the session is opened on program start and closed when the program ends. In essence, any function or coroutine can be used, as long as the aforementioned considerations are taken into account.
The best practice is to have your program activate the decorator only once, because repeatedly opening and closing the session will severely degrade performance.
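One way a single decorator can serve both modes is to check whether the decorated callable is a coroutine function. The sketch below illustrates that general technique with a hypothetical session_sketch decorator which only records the detected mode. It is not YFrake's implementation and it opens no network session.

```python
import asyncio
import functools
import inspect

detected_modes = []  # records which mode each call ran in (illustration only)


def session_sketch(func):
    """Hypothetical stand-in for a mode-aware session decorator."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            # A real decorator would open and later close the session here.
            detected_modes.append('async')
            return await func(*args, **kwargs)
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        detected_modes.append('sync')
        return func(*args, **kwargs)
    return sync_wrapper


@session_sketch
async def async_main():
    return 'async done'


@session_sketch
def sync_main():
    return 'sync done'
```

Calling asyncio.run(async_main()) takes the coroutine branch, while calling sync_main() takes the plain function branch.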
Note: On Windows machines, the decorator automatically sets the asyncio event loop policy to WindowsSelectorEventLoopPolicy, because the default WindowsProactorEventLoopPolicy does not work correctly.
This automatic selection works only when the decorated coroutine of your program is the main coroutine, which gets passed into the asyncio.run() function.
ClientResponse Object
Instances of this object are returned by the client.get method.
It handles the request and contains the response from the Yahoo Finance API servers in three properties: endpoint, error and data.
The endpoint is a string, while error and data can be either dictionaries or None.
If the request returned with an error, the error property is a dictionary and the data property is None.
If the request returned with data, then the data property is a dictionary and the error property is None.
This allows the developer to easily check the response status by writing "if resp.error is None:".
It also has methods to (a)wait for the response and to check its completion status, and two properties, event and future, to access the low-level internals of the ClientResponse object.
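Because exactly one of error and data is set once a request has finished, a response can be reduced to a plain value with a small helper. The sketch below uses a hypothetical unwrap function and SimpleNamespace stand-ins in place of real ClientResponse objects, so it runs without network access. The contents of the stand-in dictionaries are invented for illustration.

```python
from types import SimpleNamespace


def unwrap(resp):
    """Return resp.data, or raise if the response carries an error."""
    if resp.error is None:
        return resp.data
    raise RuntimeError(f'{resp.endpoint} failed: {resp.error}')


# Stand-ins which mimic the three documented properties.
ok = SimpleNamespace(endpoint='quote_type', error=None, data={'symbol': 'MSFT'})
failed = SimpleNamespace(endpoint='quote_type', error={'msg': 'not found'}, data=None)
```

With these stand-ins, unwrap(ok) returns the data dictionary and unwrap(failed) raises a RuntimeError.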
Async- and ThreadResults Object
Instances of these objects, which are returned by the client.batch_get and the client.get_all methods, are list-like containers of ClientResponse objects with additional functionality attached on top.
There are two kinds of results objects: AsyncResults and ThreadResults. Which one is returned depends on the concurrency mode of the program. AsyncResults is returned when the program is running in async mode and ThreadResults is returned when the program is running in sync (threaded) mode.
The results objects can be used with the len() and list() functions and the subscript operator [].
They have methods to (a)wait for the requests and to check their completion statuses, and also generators to iterate over the ClientResponse objects in a for or an async for loop.
These generators guarantee that the objects which they yield into the for loop have finished their request to the servers.
You can also loop over a results object with for resp in results, but the returned objects are not guaranteed to be in a finished state, unless you have specifically (a)waited the results object beforehand.
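The difference between waiting for everything and consuming results as they finish can be illustrated with plain asyncio. The sketch below does not use YFrake: fake_request, gather_order and completion_order are hypothetical names, and the delays stand in for network latency. The submission-order iteration shown in gather_order is a property of this sketch only.

```python
import asyncio

REQUESTS = [('slow', 0.2), ('fast', 0.01)]  # (name, simulated latency in seconds)


async def fake_request(name, delay):
    await asyncio.sleep(delay)  # simulated network latency
    return name


async def gather_order():
    # Wait until every task has finished, then iterate in submission order.
    tasks = [asyncio.create_task(fake_request(n, d)) for n, d in REQUESTS]
    await asyncio.wait(tasks)
    return [task.result() for task in tasks]


async def completion_order():
    # Consume each result as soon as it becomes available.
    coros = [fake_request(n, d) for n, d in REQUESTS]
    return [await fut for fut in asyncio.as_completed(coros)]
```

Running both coroutines with asyncio.run() shows the first returning the names in the order they were submitted and the second returning the faster request first.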
Reference
Client Reference
Public Decorators
- @session
- Manages the network connection to the Yahoo Finance API servers. Needs to be active only when the client methods are being called. Used internally by the YFrake server process.
- Raises
RuntimeError – if a configuration is already active.
Public Methods
- classmethod get(endpoint, **kwargs)
- Schedules a request to be made to the Yahoo Finance servers. Returns immediately with the pending response object.
- Parameters
endpoint (str) – The name of the endpoint from which to request data.
kwargs (unpacked dict) – Variable keyword arguments, which depend on the endpoint requirements. Values can be either str, int or bool.
- Raises
RuntimeError – if the session decorator is not in use.
NameError – if an invalid endpoint name has been provided.
KeyError – if an invalid query parameter has been provided.
TypeError – if the datatype of a query parameter is invalid.
- Returns
Response object
- Return type
ClientResponse
- classmethod batch_get(queries)
- Helper method which schedules multiple queries at once. Returns immediately with the pending results object.
- Parameters
queries (list) – Collection of query dicts.
- Raises
RuntimeError – if the session decorator is not in use.
NameError – if an invalid endpoint name has been provided.
KeyError – if an invalid query parameter has been provided.
TypeError – if the datatype of a query parameter is invalid.
- Returns
List-like collection object
- Return type
AsyncResults or ThreadResults
- classmethod get_all(symbol)
- Helper method which schedules a request to all symbol-specific endpoints for a given symbol at once. A single call results in 32 simultaneous requests to the Yahoo Finance API servers. Size of the returned data can vary from 1 to 1.5 megabytes. Returns immediately with the pending results object.
- Parameters
symbol (str) – Security identifier.
- Raises
RuntimeError – if the session decorator is not in use.
NameError – if an invalid endpoint name has been provided.
KeyError – if an invalid query parameter has been provided.
TypeError – if the datatype of a query parameter is invalid.
- Returns
List-like collection object
- Return type
AsyncResults or ThreadResults
ClientResponse Reference
Public Methods
- pending()
- Checks if the request has completed by calling the is_set() method on the internal event object. Returns True if the request is still in progress.
- Returns
Request completion status
- Return type
bool
- wait()
- In async mode, returns the wait() coroutine of the internal asyncio.Event object. In sync (threaded) mode, calls the wait() method on the internal threading.Event object.
- Returns
Awaitable coroutine or None
- Return type
Coroutine or None
API Response Properties
- property endpoint
- Provides access to the endpoint name of the response.
- Raises
RuntimeError – on property modification or deletion.
- Returns
Name of the endpoint.
- Return type
str
- property error
- Provides access to the error dictionary of the response.
- Raises
RuntimeError – on property modification or deletion.
- Returns
Error dict, if there was an error, or None.
- Return type
dict or None
- property data
- Provides access to the data dictionary of the response.
- Raises
RuntimeError – on property modification or deletion.
- Returns
Data dict, if there weren’t any errors, or None.
- Return type
dict or None
Internal Request Properties
- property event
- Provides access to the internal request completion event object. Return type depends on the concurrency mode of the program. In most cases, manual usage of this object is unnecessary. Disclaimer: Incorrect usage of this object can break things.
- Raises
RuntimeError – on property modification or deletion.
- Returns
Reference to the internal event object.
- Return type
asyncio.Event in async mode, threading.Event in sync (threaded) mode
- property future
- Provides access to the internal future-like request object. Return type depends on the concurrency mode of the program. In most cases, manual usage of this object is unnecessary. Disclaimer: Incorrect usage of this object can break things.
- Raises
RuntimeError – on property modification or deletion.
- Returns
Reference to the internal future-like object.
- Return type
asyncio.Task in async mode, concurrent.futures.Future in sync (threaded) mode
AsyncResults Reference
Public Methods
- pending()
- Checks the completion statuses of all its requests by calling the pending() method on each ClientResponse. Returns True if at least one request is still in progress.
- Returns
Request completion status
- Return type
bool
Public Coroutines
- async wait()
- Awaits until all its requests have completed.
- Returns
None
- async gather()
- Asynchronous generator which can be used in the async for loop. Awaits and starts yielding results when all requests have completed.
- Returns
Request response objects
- Return type
ClientResponse
- async as_completed()
- Asynchronous generator which can be used in the async for loop. Awaits and starts yielding results immediately as they become available.
- Returns
Request response objects
- Return type
ClientResponse
ThreadResults Reference
Public Methods
- pending()
- Checks the completion statuses of all its requests by calling the pending() method on each ClientResponse. Returns True if at least one request is still in progress.
- Returns
Request completion status
- Return type
bool
- wait()
- Waits until all its requests have completed.
- Returns
None
- gather()
- Synchronous generator which can be used in the for loop. Waits for and starts yielding results when all requests have completed.
- Returns
Request response objects
- Return type
ClientResponse
- as_completed()
- Synchronous generator which can be used in the for loop. Waits for and starts yielding results immediately as they become available.
- Returns
Request response objects
- Return type
ClientResponse
Examples
Async Mode Examples
Client.get() Examples
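The following example loops on resp.pending() while the response has not yet arrived:
import asyncio
from yfrake import client

@client.session
async def main():
    resp = client.get('quote_type', symbol='msft')
    while resp.pending():
        # do some other stuff; awaiting here lets the request make progress
        await asyncio.sleep(0.1)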
The following example blocks at the resp.wait() call until the response has arrived:
from yfrake import client

@client.session
async def main():
    resp = client.get('quote_type', symbol='msft')
    await resp.wait()
    # do some other stuff
Client.batch_get() Examples
The following example waits until all of the responses have arrived before running the async for loop:
from yfrake import client

@client.session
async def main():
    queries = [
        dict(endpoint='quote_type', symbol='msft'),
        dict(endpoint='price_overview', symbol='aapl'),
        dict(endpoint='key_statistics', symbol='tsla')
    ]
    results = client.batch_get(queries)
    async for resp in results.gather():
        pass  # do some stuff with the resp
The following example starts yielding the responses into the async for loop as soon as they become available:
from yfrake import client

@client.session
async def main():
    queries = [
        dict(endpoint='quote_type', symbol='msft'),
        dict(endpoint='price_overview', symbol='aapl'),
        dict(endpoint='key_statistics', symbol='tsla')
    ]
    results = client.batch_get(queries)
    async for resp in results.as_completed():
        pass  # do some stuff with the resp
Client.get_all() Examples
The following example loops while all the available data about a symbol is being retrieved:
import asyncio
from yfrake import client

@client.session
async def main():
    results = client.get_all(symbol='msft')
    while results.pending():
        # do some other stuff; awaiting here lets the requests make progress
        await asyncio.sleep(0.1)
The following example blocks while all the available data about a symbol is being retrieved:
from yfrake import client

@client.session
async def main():
    results = client.get_all(symbol='aapl')
    await results.wait()
    # do some other stuff
WARNING: A single call to get_all() creates 32 simultaneous network requests and can return up to 1.5 megabytes of data, so uncontrolled usage of this method may deplete the memory of your system and may get your IP blacklisted by Yahoo.
Sync (Threaded) Mode Examples
Client.get() Examples
The following example loops on resp.pending() while the response has not yet arrived:
from yfrake import client

@client.session
def main():
    resp = client.get('quote_type', symbol='msft')
    while resp.pending():
        pass  # do some other stuff
The following example blocks at the resp.wait() call until the response has arrived:
from yfrake import client

@client.session
def main():
    resp = client.get('quote_type', symbol='msft')
    resp.wait()
    # do some other stuff
Client.batch_get() Examples
The following example waits until all of the responses have arrived before running the for loop:
from yfrake import client

@client.session
def main():
    queries = [
        dict(endpoint='quote_type', symbol='msft'),
        dict(endpoint='price_overview', symbol='aapl'),
        dict(endpoint='key_statistics', symbol='tsla')
    ]
    results = client.batch_get(queries)
    for resp in results.gather():
        pass  # do some stuff with the resp
The following example starts yielding the responses into the for loop as soon as they become available:
from yfrake import client

@client.session
def main():
    queries = [
        dict(endpoint='quote_type', symbol='msft'),
        dict(endpoint='price_overview', symbol='aapl'),
        dict(endpoint='key_statistics', symbol='tsla')
    ]
    results = client.batch_get(queries)
    for resp in results.as_completed():
        pass  # do some stuff with the resp
Client.get_all() Examples
The following example loops while all the available data about a symbol is being retrieved:
from yfrake import client

@client.session
def main():
    results = client.get_all(symbol='msft')
    while results.pending():
        pass  # do some other stuff
The following example blocks while all the available data about a symbol is being retrieved:
from yfrake import client

@client.session
def main():
    results = client.get_all(symbol='aapl')
    results.wait()
    # do some other stuff
WARNING: A single call to get_all() creates 32 simultaneous network requests and can return up to 1.5 megabytes of data, so uncontrolled usage of this method may deplete the memory of your system and may get your IP blacklisted by Yahoo.
Various Examples
The following example prints out the names of all the endpoints queried:
from yfrake import client
import asyncio

@client.session
async def main():
    results = client.get_all(symbol='msft')
    async for resp in results.gather():
        print(f'Endpoint: {resp.endpoint}')

if __name__ == '__main__':
    asyncio.run(main())
The following example prints out either the error or the data property of the ClientResponse objects:
from yfrake import client
import asyncio

@client.session
async def main():
    queries = [
        dict(endpoint='quote_type', symbol='msft'),
        dict(endpoint='price_overview', symbol='gme_to_the_moon'),
        dict(endpoint='key_statistics', symbol='tsla')
    ]
    results = client.batch_get(queries)
    await results.wait()
    for resp in results:
        if resp.error:
            print(f'Error: {resp.error}')
        else:
            print(f'Data: {resp.data}')

if __name__ == '__main__':
    asyncio.run(main())
The following example creates a batch request of 3 endpoints for 3 symbols:
from yfrake import client

@client.session
def main():
    all_queries = list()
    for symbol in ['msft', 'aapl', 'tsla']:
        queries = [
            dict(endpoint='quote_type', symbol=symbol),
            dict(endpoint='price_overview', symbol=symbol),
            dict(endpoint='key_statistics', symbol=symbol)
        ]
        all_queries.extend(queries)

    results = client.batch_get(all_queries)
    results.wait()

    count = len(results)
    print(f'ClientResponse objects: {count}')  # 9

if __name__ == '__main__':
    main()
The following example demonstrates the usage of the get method inside a non-decorated function (or coroutine):
from yfrake import client

def make_the_request(symbol):
    resp = client.get('quote_type', symbol=symbol)
    resp.wait()
    return resp

@client.session
def main():
    resp = make_the_request('msft')
    print(f'Data: {resp.data}')

if __name__ == '__main__':
    main()
Overview
The standardized interface of the YFrake server simplifies the process of acquiring stock market data for other applications, which can use their own networking libraries to make web requests to the YFrake server.
There are two ways to run the server: you can either control it from within your Python program through the server singleton, or you can call the YFrake module directly in the terminal with python -m yfrake args.
Running the module from the terminal without any args does nothing.
The optional args are --run-server and --config-file /path, which can be used independently of each other.
The arg --config-file accepts as its only parameter either a full path to the config file or the special keyword here, which will have the server look for the config file in the Current Working Directory.
When using the keyword here, if the file does not exist, it will be created with the default settings.
If the parameter is a full path to a config file, then the file must exist, otherwise an exception will be thrown.
In all cases, the config file must be named yfrake_settings.ini.
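The rules for the --config-file parameter can be summarised in a few lines of code. This is a sketch of the described behaviour under stated assumptions, not the YFrake source: resolve_config_path is a hypothetical helper, the choice of FileNotFoundError and ValueError is an assumption (the text only says that an exception is thrown), and creating the default file for the here keyword is left out.

```python
from pathlib import Path

CONFIG_FILE_NAME = 'yfrake_settings.ini'


def resolve_config_path(arg, cwd=None):
    """Map a --config-file parameter to a concrete path (illustrative only)."""
    if arg == 'here':
        # The special keyword points at the Current Working Directory.
        base = Path(cwd) if cwd is not None else Path.cwd()
        return base / CONFIG_FILE_NAME
    path = Path(arg)
    if path.name != CONFIG_FILE_NAME:
        # Assumed exception type for a wrongly named file.
        raise ValueError(f'config file must be named {CONFIG_FILE_NAME}')
    if not path.is_file():
        # Assumed exception type for a missing file.
        raise FileNotFoundError(path)
    return path
```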
When --run-server is used without the --config-file arg, the server is run with the default settings.
Using --config-file here without the --run-server arg is useful for getting a copy of the config file with the default settings into the CWD.
You can access the built-in Swagger documentation by running the server and navigating to the server's root address in your web browser (default: http://localhost:8888).
You can perform queries to the endpoints either directly through the Swagger Docs UI, or by navigating to the appropriate URLs in the address bar of your web browser.
When accessing endpoints through their URLs, each endpoint has a path name like /market_summary.
To request data from that endpoint, you would write in your address bar: http://localhost:8888/market_summary
If an endpoint like /company_overview requires query parameters, then you would write in your address bar: http://localhost:8888/company_overview?symbol=msft
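Other applications can assemble these URLs programmatically. A minimal sketch using only the Python standard library follows. The build_url helper is a hypothetical name, and the base address is the default quoted above.

```python
from urllib.parse import urlencode

BASE_URL = 'http://localhost:8888'  # default server address from the text above


def build_url(endpoint, **params):
    """Build the URL of a server endpoint with optional query parameters."""
    url = f'{BASE_URL}/{endpoint}'
    if params:
        url += '?' + urlencode(params)
    return url
```

The resulting string can be passed to any HTTP client, for example urllib.request.urlopen, once the server is running.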
Reference
- classmethod server.start()
Starts the YFrake server. Only one server can be active per process at any time.
- Raises
RuntimeError – if the server is already running.
- Returns
None
- classmethod server.stop()
Stops the YFrake server.
- Raises
RuntimeError – if the server is already stopped.
- Returns
None
- classmethod server.is_running()
Checks if the server is running.
- Returns
Server status
- Return type
bool
Examples
Running the server programmatically:
from yfrake import server

if not server.is_running():
    server.start()

# do other stuff

if server.is_running():
    server.stop()
Calling the module from the terminal:
$ python -m yfrake --config-file here
$ python -m yfrake --run-server
$ python -m yfrake --run-server --config-file here
$ python -m yfrake --run-server --config-file "/path/to/yfrake_settings.ini"
Overview
Configuration settings for YFrake are stored in a file named yfrake_settings.ini.
The config singleton reads the settings from that file and configures the client and the server objects.
It is not necessary to use the config object if you want to run YFrake with the default settings.
The config object has two properties named file and settings, and one method named is_locked, which is used to check if the configuration is locked, i.e., the client.session decorator is in use (active).
All the properties of the config object can be read at any time, but the file property can be modified only when the client.session decorator is not in use (active).
The file property accepts either a pathlib.Path or a string object which contains a full path to a config file.
Modifying the file property after the server has started has undefined behaviour and is therefore not recommended.
Accessing the settings property will return a dictionary of the currently loaded configuration.
Modifying this dictionary does not modify the currently loaded configuration.
The config object also has an attribute named HERE, which points to an abstract config file in the Current Working Directory.
Assigning the HERE attribute to the file property will create the config file in the CWD with the default settings, if it doesn’t exist.
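The reason why changes to the returned dictionary have no effect is that the property hands out a deep copy. The sketch below demonstrates that general pattern with a hypothetical ConfigSketch class and invented setting names. It is not the YFrake config object.

```python
import copy


class ConfigSketch:
    """Hypothetical holder which hands out deep copies of its settings."""

    def __init__(self, loaded):
        self._loaded = loaded

    @property
    def settings(self):
        # Callers receive an independent copy, nested dictionaries included.
        return copy.deepcopy(self._loaded)


config_sketch = ConfigSketch({'EXAMPLE': {'ttl': 60}})  # invented keys
snapshot = config_sketch.settings
snapshot['EXAMPLE']['ttl'] = 0  # changes the copy only
```

Reading config_sketch.settings again after the assignment still yields the original value, because only the copy held in snapshot was changed.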
Reference
Public Methods
- classmethod is_locked()
- Helper method which is used to check if the configuration is being used by the client.session decorator. Any attempt to change the configuration while the session is open will cause a RuntimeError to be thrown.
- Returns
Value of the config lock status.
- Return type
bool
Public Properties
- class property file
- The full path to the configuration file which should be used by the client and the server objects. Can be assigned either a pathlib.Path or a str object.
- Raises
TypeError – on attempt to delete the property.
- Returns
Full path to the config file to be used.
- Return type
pathlib.Path
- class property settings
- Deep copied dictionary of the currently loaded configuration. This property is READ ONLY.
- Raises
TypeError – on attempt to modify the property.
TypeError – on attempt to delete the property.
- Return type
dict
Examples
Correct Usage Examples
Running with the default settings, without using the config object:
from yfrake import client

@client.session
def main():
    pass  # do stuff

main()
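Using the config file in the Current Working Directory, which is created with the default settings if it does not exist:
from yfrake import client, config

config.file = config.HERE

@client.session
def main():
    pass  # do stuff

main()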
Assigning a custom config file in the specified path:
from yfrake import client, config

config.file = "C:/Users/username/Projects/Project Name/yfrake_settings.ini"

@client.session
def main():
    pass  # do stuff

main()
Reading the currently loaded configuration settings:
from yfrake import client, config

settings = config.settings  # correct

@client.session
def main():
    settings = config.settings  # also correct

main()
Assigning a custom config file before the server is started:
from pathlib import Path
from yfrake import server, config

config.file = Path("C:/Users/username/Projects/Project Name/yfrake_settings.ini")
server.start()

# defined behaviour

server.stop()
Incorrect Usage Examples
Trying to assign a config file in the Current Working Directory while the session is active:
from yfrake import client, config

@client.session
def main():
    config.file = config.HERE

    # will raise an exception

main()
Trying to assign a custom config file in the specified path while the session is active:
from yfrake import client, config

@client.session
def main():
    config.file = "C:/Users/username/Projects/Project Name/yfrake_settings.ini"

    # will raise an exception

main()
Assigning a custom config file after the server has started:
from pathlib import Path
from yfrake import server, config

server.start()
config.file = Path("C:/Users/username/Projects/Project Name/yfrake_settings.ini")

# undefined behaviour

server.stop()
Config File
Description
TTL values are given in integer seconds. All settings in the config file affect the behaviour of both the client and the server, except those in the SERVER section, which affect only the behaviour of the server.
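Since the file uses the INI format, it can also be inspected with Python's configparser module. In the sketch below only the SERVER section name comes from the text above. The CACHE_TTL section and every key name are placeholders invented for illustration and will differ from the real file.

```python
import configparser

# Placeholder content: apart from [SERVER], all names here are invented.
SAMPLE = """
[CACHE_TTL]
example_endpoint = 60

[SERVER]
example_option = 1
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# TTL values are integer seconds, so they can be read with getint().
ttl_seconds = parser.getint('CACHE_TTL', 'example_endpoint')

# Settings in the SERVER section would apply to the server only.
server_only = dict(parser['SERVER'])
```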