Pipelines and Rollout Store

Pipelines

Pipelines in trlX read data from a dataset and feed it to the models for training or inference.

class trlx.pipeline.BasePipeline(path='dataset')
Parameters:

path (str) –

abstract create_loader(batch_size, shuffle, prep_fn=None, num_workers=0)

Create a dataloader for the pipeline

Parameters:
  • prep_fn (Optional[Callable]) – Typically a tokenizer. Applied to GeneralElement after collation.

  • batch_size (int) –

  • shuffle (bool) –

  • num_workers (int) –

Return type:

DataLoader
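
As a rough illustration of the create_loader contract, the sketch below mirrors its signature in plain Python. Everything in it is hypothetical: ToyPipeline does not subclass BasePipeline, a list of batches stands in for the DataLoader that trlX returns, and num_workers is accepted but ignored. It only shows the order of operations the docstring describes: elements are grouped into batches first, then prep_fn is applied to each collated batch.

```python
import random
from typing import Any, Callable, List, Optional


class ToyPipeline:
    """Hypothetical stand-in that mirrors the create_loader signature."""

    def __init__(self, elements: List[Any], seed: int = 0):
        self.elements = list(elements)
        self.seed = seed  # fixed seed so shuffling is reproducible in this sketch

    def create_loader(
        self,
        batch_size: int,
        shuffle: bool,
        prep_fn: Optional[Callable] = None,
        num_workers: int = 0,  # ignored: this sketch is single-process
    ) -> List[Any]:
        order = list(range(len(self.elements)))
        if shuffle:
            random.Random(self.seed).shuffle(order)
        batches = []
        for start in range(0, len(order), batch_size):
            # Collate: group up to batch_size elements into one list.
            batch = [self.elements[i] for i in order[start:start + batch_size]]
            # prep_fn runs on the collated batch, not on single elements.
            batches.append(prep_fn(batch) if prep_fn is not None else batch)
        return batches


pipeline = ToyPipeline(["a b", "c", "d e f"])
# A whitespace split plays the role of a tokenizer here.
loader = pipeline.create_loader(
    batch_size=2, shuffle=False, prep_fn=lambda batch: [s.split() for s in batch]
)
```

A real pipeline would return a torch.utils.data.DataLoader instead of a list, which is where num_workers would take effect.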

class trlx.pipeline.BaseRolloutStore(capacity=-1)

abstract create_loader(batch_size, shuffle, prep_fn=None, num_workers=0)

Create a dataloader for the rollout store

Parameters:
  • prep_fn (Callable) – Applied to RLElement after collation (typically a tokenizer)

  • batch_size (int) –

  • shuffle (bool) –

  • num_workers (int) –

Return type:

DataLoader

abstract push(exps)

Push experiences to rollout storage

Parameters:

exps (Iterable[Any]) –
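
A minimal sketch of the push / create_loader pair, written without trlX or PyTorch so that it runs on its own. ToyRolloutStore is a hypothetical class, not a subclass of BaseRolloutStore. Two behaviours are assumptions made for illustration, since the reference above does not specify them: capacity=-1 is treated as unbounded, and a full store discards its oldest experiences.

```python
import random
from collections import deque
from typing import Any, Iterable


class ToyRolloutStore:
    """Hypothetical store mirroring the push / create_loader interface."""

    def __init__(self, capacity: int = -1):
        # Assumption: a negative capacity means "no limit".
        self.history = deque(maxlen=None if capacity < 0 else capacity)

    def push(self, exps: Iterable[Any]) -> None:
        # Assumption: once the store is full, the oldest experiences are dropped.
        self.history.extend(exps)

    def create_loader(self, batch_size, shuffle, prep_fn=None, num_workers=0):
        # num_workers is ignored; a list of batches stands in for a DataLoader.
        items = list(self.history)
        if shuffle:
            random.shuffle(items)
        batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
        return [prep_fn(b) if prep_fn is not None else b for b in batches]


store = ToyRolloutStore(capacity=3)
store.push([{"reward": 0.1}, {"reward": 0.5}])
store.push([{"reward": 0.9}, {"reward": 0.2}])  # evicts the first experience
batches = store.create_loader(batch_size=2, shuffle=False)
```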

Rollout Stores

Rollout stores in trlX hold the experiences that the orchestrator creates for the models. These experiences serve as the training data: each model learns from the experiences in its rollout store.

PPO

class trlx.pipeline.ppo_pipeline.PPORolloutStorage(pad_token_id)

Rollout storage for training PPO

create_loader(batch_size, shuffle)

Create a dataloader for the rollout store

Parameters:
  • prep_fn (Callable) – Applied to RLElement after collation (typically a tokenizer). Listed in the inherited docstring; not part of this method's signature.

  • batch_size (int) –

  • shuffle (bool) –

Return type:

DataLoader

push(exps)

Push experiences to rollout storage

Parameters:

exps (Iterable[PPORLElement]) –
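
The pad_token_id argument suggests that experiences of different lengths are padded to a common length when they are batched. The sketch below shows that idea only; it is not the PPORolloutStorage collation. pad_batch is a hypothetical helper, right-padding is an assumption, and the experiences are reduced to plain lists of token ids rather than PPORLElement objects.

```python
from typing import List


def pad_batch(sequences: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one (assumed padding side)."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_token_id] * (max_len - len(seq)) for seq in sequences]


responses = [[11, 12, 13], [21], [31, 32]]
padded = pad_batch(responses, pad_token_id=0)
```

After padding, every row has the same length, so the batch can be stacked into a single rectangular tensor.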

ILQL

class trlx.pipeline.offline_pipeline.PromptPipeline(prompts, max_prompt_length, tokenizer)

Tokenizes prompts, unless they are already tokenized, and truncates them to max_prompt_length from the right

Parameters:
  • prompts (List[str]) –

  • max_prompt_length (int) –

  • tokenizer (PreTrainedTokenizer) –

create_loader(batch_size, shuffle=False)

Create a dataloader for the pipeline

Parameters:
  • prep_fn – Typically a tokenizer. Applied to GeneralElement after collation. Listed in the inherited docstring; not part of this method's signature.

  • batch_size (int) –

Return type:

DataLoader
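
The tokenize-then-truncate behaviour described for PromptPipeline can be sketched without a real tokenizer. In this sketch toy_tokenize and prepare_prompts are hypothetical names, a whitespace split stands in for a PreTrainedTokenizer, and "from the right" is read as dropping tokens from the right end so that the first max_prompt_length tokens are kept. That reading is an assumption; check the tokenizer's truncation settings before relying on it.

```python
from typing import List, Sequence, Union

Prompt = Union[str, Sequence[int]]


def toy_tokenize(text: str) -> List[int]:
    # Stand-in tokenizer: one id per whitespace-separated word (its length).
    return [len(word) for word in text.split()]


def prepare_prompts(prompts: List[Prompt], max_prompt_length: int) -> List[List[int]]:
    prepared = []
    for prompt in prompts:
        # Strings are tokenized; anything else is assumed to be token ids already.
        tokens = toy_tokenize(prompt) if isinstance(prompt, str) else list(prompt)
        # Assumed reading of "from the right": keep the leftmost tokens.
        prepared.append(tokens[:max_prompt_length])
    return prepared


result = prepare_prompts(["the quick brown fox", [7, 8, 9, 10, 11]], max_prompt_length=3)
```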

class trlx.pipeline.offline_pipeline.ILQLRolloutStorage(input_ids, attention_mask, rewards, states_ixs, actions_ixs, dones)

Rollout storage for training ILQL

create_loader(batch_size, drop_last=True)

Create a dataloader for the rollout store

Parameters:
  • prep_fn (Callable) – Applied to RLElement after collation (typically a tokenizer). Listed in the inherited docstring; not part of this method's signature.

  • batch_size (int) –
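
In PyTorch's DataLoader, drop_last=True discards a trailing batch that is smaller than batch_size; the default shown in the signature above presumably carries the same meaning. A dependency-free sketch of the effect, using a hypothetical batch_indices helper in place of a real loader:

```python
from typing import List


def batch_indices(n_samples: int, batch_size: int, drop_last: bool = True) -> List[List[int]]:
    """Split sample indices into consecutive batches."""
    batches = [
        list(range(start, min(start + batch_size, n_samples)))
        for start in range(0, n_samples, batch_size)
    ]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()  # discard the incomplete final batch
    return batches


kept = batch_indices(10, batch_size=4, drop_last=False)
dropped = batch_indices(10, batch_size=4, drop_last=True)
```

With 10 samples and a batch size of 4, the last two samples form an incomplete batch, which appears in kept but not in dropped.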